
BIG DATA – SYNCHRONOUS

This is a free program worth 2 ECTS credits. Students are required to attend 16 hours of synchronous online classes (see the time schedule below) and to complete some autonomous work at home.

Each participant who completes this course, either synchronously or asynchronously, will receive a personalised certificate from EIT.

Summary of the course

This introductory course on Big Data offers a concise overview of the fundamental concepts and technologies in large-scale data management. It covers the 5 V’s of Big Data, the role of Hadoop, Spark, NoSQL databases, and Data Lakes, as well as basic approaches to batch and streaming processing. Participants are introduced to core methods of data analysis and visualization, alongside critical discussions on governance, privacy (GDPR/LGPD), and ethical issues. Designed as a foundation, the course provides essential knowledge for students and professionals preparing to advance in the field of Big Data.

Program of themes
  • What is Big Data? The 5 V’s (Volume, Velocity, Variety, Veracity, Value)
  • Challenges and Opportunities of Big Data
  • Data vs. Information vs. Knowledge
  • Big Data in the Current Context (trends, use cases)
  • Overview of the Hadoop Ecosystem (HDFS, YARN, MapReduce)
  • Introduction to NoSQL Databases (MongoDB, Cassandra, Neo4j – concepts and uses)
  • Real-Time Processing Tools (Kafka, Spark Streaming – introduction)
  • Concepts of Data Lakes and Data Warehouses
  • Hadoop Distributed File System (HDFS): Architecture and Basic Operations
  • Apache Spark: Core Concepts (RDDs, DataFrames, Spark SQL) and Applications
  • Batch vs. Stream Processing: Differences and Use Cases
  • Scalability and Fault Tolerance Challenges
  • Principles of Large-Scale Data Analysis
  • Introduction to Python for Data Analysis (Pandas, NumPy – brief overview)
  • Machine Learning in Big Data (overview of algorithms, examples)
  • Data Visualization Tools (Tableau, Power BI, or similar – concept introduction)
  • Data Quality and Governance in Big Data Environments
  • Data Security and Privacy (GDPR/LGPD and other regulations)
  • Ethics and Responsibility in the Use of Big Data
  • Compliance and Audit Challenges
  • Copyright, database rights and licensing schemes
  • Assessment of strategies
  • Data collection and visualization
  • Statistical analysis
  • Examples of enhancement in organizational strategies

Time schedule (all times in CET)

Day            Hours
November 4     17:30-19:30
November 6     17:30-19:30
November 11    17:30-19:30
November 13    17:30-19:30
November 18    17:30-19:30
November 24    17:00-18:30
November 25    17:00-18:30
November 27    17:00-18:30
November 28    17:00-18:30

TRAINERS

Filipe Madeira

Themes 1, 2, 3, 4 and 5

Sveva Ianese

Theme 6

Alessio Chisari

Theme 7

Tomás Matos

Themes 1, 2, 3, 4 and 5