Development Environment in Data Science

Data Science
Engineering and development
Open Source
On-site courses
Remote/Virtual
Face-to-face
English
French

Objectives

Understand the data science ecosystem and become familiar with the tools used to carry out a data science project.

Prerequisites:

Participants should be comfortable with IT tools and have an Internet connection available.

Educational and technical material and resources:

  • Sessions with the trainer
  • Teaching aids in digital format
  • Alternating between theory and practice

Assessment:

Practical applications and exercises, with assessment carried out during the training.

Expected results & skills at the end of the training:

At the end of this course, participants will have a clear idea of what data science is, the tools available for implementing data science projects, which programming language to choose and how to organise their work.

Program

DAY 1

  • The Unix environment, interacting with a shell, open source tools (sed, awk, grep, jq, csvkit, etc.), R and Python, SQL and NoSQL
  • Revision control and collaborative work with Git
  • The methodology for managing a data science project
  • Software engineering fundamentals and best practices

DAY 2

  • Information gathering and processing (experimental designs and clinical trials, surveys and polls, web data, open data)
  • Distributed architectures and databases, MapReduce, big data, Apache Spark

Duration: 14 hours
Level: Beginner
Audience: Anyone who wants to discover the data science ecosystem.
Participants: 8 people maximum
Contact us for a personalized quote.

Are you looking for information about a training course?

Would you like to set up a customized training session?

Contact our pedagogical team!