NUS-ISS

Data Preparation in Action

Prepare the data well so models and analysis are accurate

Overview

Part of: -
Duration: 5 days
Course Time
Enquiry: Please contact ask-iss@nus.edu.sg for more details.

The premise of this 5-day course is that better data preparation leads to more reliable AI models and better business outcomes.

In many analytics and AI projects, the biggest challenge is not just building the model: projects fail when the data is messy, incomplete, biased, poorly structured, or not machine-readable. Missing values, inconsistent formats, duplicated records, outliers, fragmented sources, and unstructured inputs can weaken model performance, slow down projects, and reduce trust in business insights.

The course focuses on real workplace data challenges. You will work with tabular datasets as well as multimodal data such as text, images, and audio. Through guided exercises and applied workshops, you will practise handling missing values and outliers, merging and reshaping datasets, creating useful features, reducing dimensionality, and converting unstructured data into numerical representations that machine learning models can use.
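As a rough illustration of what handling missing values and outliers can look like in practice, the sketch below fills gaps with the median and flags extreme values with the interquartile range (IQR) rule. It uses only the Python standard library. The function names and the sample `spend` column are hypothetical and are not taken from the course materials, which may use other tools and methods.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical column with two gaps and one extreme value.
spend = [120, 135, None, 128, 131, 2500, None, 126]
clean = impute_median(spend)   # gaps filled with the median of observed values
flagged = iqr_outliers(clean)  # positions of values the IQR rule treats as extreme
```

Median imputation is chosen here because the median is less affected by the extreme value than the mean would be. Whether a flagged value is an error or a genuine observation still needs human judgement.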

Participants will learn how to use modern AI tools to accelerate the data preparation process while applying the validation, judgement, and quality checks needed for trustworthy analytics and AI outcomes.

By the end of the course, you will be better equipped to turn messy and complex data into high-quality, modelling-ready datasets. This will help you reduce rework, improve model reliability, support better predictive performance, and generate insights that are more useful for business decisions.

Key Takeaways

By the end of the course, participants are expected to be able to:

  • Work with tabular data to perform advanced transformations, merging, inspection, cleaning, and preparation for analysis.
  • Apply exploratory data preprocessing techniques to handle missing values and outliers using both statistical and machine learning-based methods, supported by AI tools.
  • Use feature creation techniques and dimensionality reduction methods, including PCA, to simplify datasets while preserving key information and improving model performance.
  • Convert and process text, image, and audio data into numerical representations suitable for machine learning using AI-assisted tools.
  • Integrate data processing, machine learning, and GenAI techniques to solve real-world business case studies.
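To make the PCA takeaway above more concrete, here is a minimal sketch of PCA on a two-feature dataset, written with the standard library only. It centres the data, builds the 2x2 sample covariance matrix, and uses the closed-form eigenvalues of a symmetric 2x2 matrix to find the first principal component. The function name `pca_2d` and the sample points are hypothetical. Real projects would normally rely on a numerical library rather than this hand-rolled version.

```python
from math import hypot, sqrt


def pca_2d(points):
    """Return (unit direction, explained variance ratio, 1-D scores)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centred) / (n - 1)
    c = sum(y * y for _, y in centred) / (n - 1)
    b = sum(x * y for x, y in centred) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = half_trace + radius, half_trace - radius

    # Eigenvector belonging to the larger eigenvalue lam1.
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = hypot(vx, vy)
    direction = (vx / norm, vy / norm)

    # Project each centred point onto the first principal component.
    scores = [x * direction[0] + y * direction[1] for x, y in centred]
    return direction, lam1 / (lam1 + lam2), scores


# Hypothetical data in which the second feature is roughly twice the first.
points = [(1, 2), (2, 4.1), (3, 5.9), (4, 8.2), (5, 9.8)]
direction, explained, scores = pca_2d(points)
```

Because the two features are almost perfectly correlated in this sample, a single component captures nearly all of the variance, so the two columns can be replaced by the one `scores` column with little loss of information.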



Who Should Attend

This course is suitable for data professionals and technical practitioners who work with data preparation, processing, and analysis for AI and machine learning applications. This includes data analysts, data scientists, data engineers, and AI practitioners who want to strengthen their skills in tabular data handling, preprocessing, and multimodal data integration.

It is also relevant for software engineers, business analysts, and digital transformation teams who are involved in building or supporting data-driven solutions and want to gain practical experience with real-world data workflows and AI-enabled data processing techniques.

Pre-requisites
Participants should have prior experience in Python programming and foundational knowledge of machine learning models. Familiarity with basic linear algebra is beneficial.

Participants should also be comfortable using AI-assisted tools to support coding and analysis.

What to Bring

No printed copies of course materials are issued.
Participants must bring their own internet-enabled computing device (laptop, tablet, etc.) with a power charger to access and download course materials.

If you are bringing a laptop, please see below for the tech specs:

Operating Systems
  Minimum: Windows 7, 8, 10 or Mac OS
  Recommended: Laptop running the latest version of either Windows or Mac OS

System Type
  Minimum: 32-bit
  Recommended: 64-bit

Memory
  Minimum: 8 GB RAM
  Recommended: 16+ GB RAM

Hard Drive
  256 GB disk size

Others
  Minimum:
  • An internet connection – broadband wired or wireless
  • Installation permissions (non-company laptops)
  • Keyboard
  • Mouse/Trackpad
  • Display
  • Power adapter (laptop battery might run out)
  Recommended: DirectX 10 graphics card for graphics hardware acceleration

 



What Will Be Covered

  • Working with tabular data, including advanced table transformations, merging datasets, and data cleaning techniques. Participants will also explore data inspection, reshaping, and preparing structured datasets through lectures, discussions, and real-world case studies.
  • Exploratory data preprocessing, with emphasis on handling missing values and outliers using both statistical and machine learning-based methods. Participants will apply these techniques in hands-on workshops to strengthen data quality and preprocessing skills.
  • Introduction to dimensionality reduction techniques, including feature creation, feature reduction, and Principal Component Analysis (PCA). Participants will gain practical experience in simplifying datasets while preserving key information through case-based learning and workshops.
  • Focus on multimodal data inputs, including text, image, and audio data. Participants will learn how to extract, clean, and integrate structured and unstructured data for analytics and AI applications.
  • AI-enabled preparation of text and image data, including sentiment features, summarisation-assisted labelling, translation for multilingual datasets, image feature extraction, and the use of GenAI to support data processing workflows. Participants will complete an integrated case study that connects data preparation decisions to model quality and business outcomes.
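For readers new to the idea of turning unstructured text into numbers, the sketch below shows one of the simplest numerical representations, a bag-of-words count vector. It is a baseline illustration only and is not the AI-assisted approach described above. The function names and the sample `reviews` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, vectors), one count vector per document."""
    tokenized = [tokenize(d) for d in documents]
    vocabulary = sorted({t for doc in tokenized for t in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[t] for t in vocabulary])
    return vocabulary, vectors


# Hypothetical customer reviews.
reviews = ["Great service, great price", "Slow service"]
vocab, vectors = bag_of_words(reviews)
# vocab   -> ['great', 'price', 'service', 'slow']
# vectors -> [[2, 1, 1, 0], [0, 0, 1, 1]]
```

Each document becomes a fixed-length list of word counts that a machine learning model can consume. The representation ignores word order and meaning, which is one reason richer approaches such as learned embeddings are often preferred.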



Fees & Subsidies

Fees for 2025
                                              Full Fee  Singaporeans & PRs (self-sponsored)
Full course fee                               S$4500    S$4500
ISS Subsidy                                   -         (S$450)
Nett course fee                               S$4500    S$4050
9% GST on nett course fee                     S$405     S$364.50
Total nett course fee payable, including GST  S$4905    S$4414.50
Note:
  1. All fees and subsidies are valid from January 2024, unless otherwise advised.
  2. All self-sponsored Singaporeans aged 25 and above can use their SkillsFuture Credit to pay for course fees. For more information about SkillsFuture Credit, click here.
  3. From 1st January 2024, the GST will be increased to 9%.




Certificate

The ISS Certificate of Completion will be issued to participants who have attended at least 75% of the course and passed the required assessments.




Preparing for Your Course

NUS-ISS Course Registration Terms and Conditions

Find out more.

NUS-ISS and Learner’s Commitment and Responsibilities

Find out more.

WIFI Access

WIFI access will be made available to participants.

Venue

NUS-ISS
25 Heng Mui Keng Terrace
Singapore 119615

Click HERE for directions to NUS-ISS

In the event of a change of venue, participants are advised to refer to the acceptance email sent one week prior to the commencement date.

Course Confirmation

All classes are subject to confirmation, and NUS-ISS will send an acceptance email to participants one week prior to the commencement date. Confirmed registrants are to attend and complete all lectures, class exercises, workshops, and assessments (where applicable). Additionally, all responses to feedback requests and surveys conducted by NUS-ISS and its partners must be submitted. All training and assessments will be delivered as described on the course webpage.

General Enquiry

Please feel free to write to ask-iss@nus.edu.sg if you have any enquiries or feedback.




Course Resources

Develop your Career in the Following Training Roadmap(s)

Please click on the discipline(s) to view the training roadmap of related courses to assess your training needs and goals.

Data Science

Driving business decisions using insights from Data
