VIDEO RECORDING
AI & BigData Online Day 2020

2 Tracks
12 Hours of content
15 Videos
Data Science Solutions Track
09:30 – 10:00
Registration
10:00 – 10:45
From the Earth to the Moon: Lessons from the Space Race to Apply in Machine Learning (EN)
Diego Hueltes
The space race was a competition between the US and the Soviet Union to conquer space. This competition helped develop space technology at an incredible pace, producing other derivative technologies as a side effect. The race was full of successes on both sides, achieving goals that seemed impossible in record time. From the space race we can learn lessons that we can apply to our machine learning projects to achieve a higher success rate in a limited amount of time.
10:45 – 11:30
Practical Explainability for AI - with examples (EN)
Tsvi Lev
The explainability of AI models is becoming more than a 'nice to have' feature – it is essential for a model to be used and trusted. I will cover the main criteria and the evolution of explainability as a requirement, and make the case that it is also closely related to customer satisfaction and to achieving the elusive goal of continuous improvement in model performance. Practical examples from computer vision and medical AI diagnostics will be covered.
11:30 - 12:15
Integrating Small Data, Synthetic Data in AI (RU)
Andrey Golub
Why is "small data" important for advancing AI? Will big data models work correctly if trained on synthetic data? Can we properly define and train a neural network from a limited dataset by augmenting its data with extra knowledge (metadata)? This talk is about the role of small data in the future of AI. Efforts have already begun in this direction. Although the current mantra of deep learning says "you need big data for AI", more often than not AI becomes even more intelligent and powerful if it can be trained with small data. Some AI solutions that rely only on small data outperform those working with big data; others use big data to learn how to take advantage of small data. The presenter will illustrate, with practical and visual examples from the fashion retail sector, approaches and methods for using small and synthetic data to train AI and big data systems, starting from the "big difference" but common nature of small data and big data. The case of a visual search service will be explained in detail: its training is initially based on 3D CAD models and their metadata, combined with real pictures at later stages of training.
12:15 - 13:00
Huawei AI computing in the service of industry and education (RU)
Goran Licanin
AI is becoming an everyday supporter, from YouTube content to fault and maintenance prediction for aeroplane engines. At the same time, market needs and technology limits pose many challenges for AI vendors, while society questions the adoption of new technologies and calls on ethics for help. Huawei is one of the vendors providing an end-to-end AI solution – what we call the All-Scenario Full-Stack AI Computing Portfolio: from chipsets and AI enablement in the device to edge and data-center AI clusters, and from the CANN libraries that give you access to the chipsets to MindSpore and ModelArts, which provide the platform for the full AI development life cycle. In this presentation, you will understand Huawei's AI strategy and how Huawei is working with industries and education all around the world.
1. AI is being adopted by society at high speed; the reasons are heterogeneous computing, 5G, increased and more accessible computing power, and new AI algorithms.
2. We are on the edge of an innovation explosion, and national economies have huge expectations in terms of GDP growth based on AI adoption.
3. Huawei has an AI strategy to answer all these expectations and challenges, and a full AI technology stack providing AI developers and integrators with end-to-end capabilities, both hardware and software.
4. AI is transforming industry and is part of our everyday life, and Huawei is part of it – with examples.
13:00 - 13:45
Load up on GANs, bring your friends (RU)
Michael Konstantinov
1. Can a neural network train another neural network? Living in the world of Generative Adversarial Networks. What types of GANs exist?
2. How powerful are modern GANs, or can you still tell what's real? Deepfakes and the Turing test.
3. Image-to-image translation. Translation in computer vision is as cool as in NLP, or even cooler.
4. What's next? The future of GANs, normalizing flows. How and where to study GANs?
13:45 - 15:15
Lunch
15:15 - 16:00
Image augmentation: how we built albumentations (UA)
Eugene Khvedchenia
16:00 - 16:45
Fairness of machine learning models – an overview and practical considerations (EN)
Ramon van den Akker
After major incidents, such as the Cambridge Analytica scandal and the alleged racial bias in the COMPAS system that assessed recidivism risk in the US, the call for responsible AI frameworks increased. Books such as Weapons of Math Destruction, The Black Box Society, Automating Inequality, and Against Prediction also helped to create awareness about the (unintentional) adverse effects of AI and data science systems. Over the past few years, major technology companies have published ethical guidelines on the use of data science and AI. In addition, governance bodies and governments have proposed high-level principles; the Ethics Guidelines for Trustworthy AI by the High-Level Expert Group on AI is an important and leading example in Europe. These frameworks consist of high-level principles which typically include 'transparency', 'explainability', and 'fairness'. In this talk we will focus on the implementation of the 'fairness' principle. We will discuss the steps in the data science workflow in which choices that affect the (un)fairness of a model need to be made, the monitoring of (un)fairness, and the construction of models that are fair by design.
16:45 - 17:30
Model Validation Tips and Tricks to Ensure AI System Quality (EN)
Olivier Blais
17:30 - 18:15
Deep learning on mobile (EN)
Siddha Ganju
Over the last few years, CNNs have risen in popularity, especially in the area of computer vision. Many mobile applications running on smartphones and wearable devices would potentially benefit from the new opportunities enabled by deep learning techniques. However, CNNs are by nature computationally and memory intensive, making them challenging to deploy on a mobile device. Siddha explains how to practically bring the power of convolutional neural networks and deep learning to memory- and power-constrained devices like smartphones. You'll learn various strategies to circumvent obstacles and build mobile-friendly shallow CNN architectures that significantly reduce the memory footprint, making them easier to store on a smartphone. She also dives into how to use a family of model compression techniques to prune the network size for live image processing, enabling you to build a CNN version optimized for inference on mobile devices. Along the way, you'll learn practical strategies to preprocess your data in a manner that makes the models more efficient in the real world.
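One family of compression techniques the abstract mentions, magnitude-based weight pruning, can be sketched in a few lines. This is a simplified illustration under assumed NumPy arrays, not the exact method covered in the talk; the `magnitude_prune` name is an assumption for this sketch.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of the weights.

    A pruned (sparse) weight matrix compresses well and can shrink a
    model's on-device footprint with little accuracy loss.
    """
    flat = np.abs(weights).ravel()
    k = int(flat.size * sparsity)          # number of weights to drop
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold     # keep only the large weights
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
pruned = magnitude_prune(w, 0.5)           # half the entries become zero
```

Real frameworks apply this iteratively during training (prune, fine-tune, repeat), but the core idea is this one thresholding step.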
18:15 - 19:00
30 Golden Rules of Deep Learning Performance (EN)
Anirudh Koul
"Watching paint dry is faster than training my deep learning model."
"If only I had ten more GPUs, I could train my model in time."
"I want to run my model on a cheap smartphone, but it's probably too heavy and slow."

If this sounds like you, then you might like this talk.

Exploring the landscape of training and inference, we cover a myriad of tricks that step-by-step improve the efficiency of most deep learning pipelines, reduce wasted hardware cycles, and make them cost-effective. We identify and fix inefficiencies across different parts of the pipeline, including data preparation, reading and augmentation, training, and inference.

With a data-driven approach and easy-to-replicate TensorFlow examples, fine-tune the knobs of your deep learning pipeline to get the best out of your hardware. And with the money you save, demand a raise!
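One recurring trick in this space is overlapping data loading with compute, so the accelerator never idles while a batch is read from disk. A framework-agnostic sketch using only the standard library (the `prefetch` helper below is an illustrative assumption, not code from the talk):

```python
import queue
import threading

def prefetch(iterable, buffer_size=4):
    """Pull items from `iterable` in a background thread so the
    consumer (e.g. a training loop) never stalls waiting for I/O."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()

    def producer():
        for item in iterable:
            q.put(item)
        q.put(sentinel)                    # signal end of stream

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item

batches = list(prefetch(range(10)))        # items arrive in order
```

In TensorFlow the equivalent knob is `tf.data.Dataset.prefetch`; the sketch above just makes the producer/consumer overlap explicit.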

19:00 - 19:15
Conference Track Closing
Machine Learning Track
09:30 – 10:00
Registration
10:00 – 10:45
From the Earth to the Moon: Lessons from the Space Race to Apply in Machine Learning (EN)
Diego Hueltes
The space race was a competition between the US and the Soviet Union to conquer space. This competition helped develop space technology at an incredible pace, producing other derivative technologies as a side effect. The race was full of successes on both sides, achieving goals that seemed impossible in record time. From the space race we can learn lessons that we can apply to our machine learning projects to achieve a higher success rate in a limited amount of time.
10:45 – 11:30
Financial ML != ML and Finance (UA)
Alex Honchar
It's hard to find a field where machine learning hasn't already made its impact. But despite all the buzzwords and marketing, machine learning in finance is still an open case. The main reason is that machine learning practitioners treat financial data the way they routinely treat data in computer vision or NLP, not taking into account the stochastic underlying nature of the data and the related decisions. This deep-dive presentation shows, on real examples, the top mistakes that practitioners make with financial time series, how these mistakes can be fixed, and how drastically the fixes change the related results and the whole research process.
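A classic instance of such a mistake is shuffling time-ordered data into random train/test splits, which leaks future information into training. A minimal walk-forward splitter makes the temporal constraint explicit (the function name and parameters are illustrative assumptions, not material from the talk):

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train, test) index lists where every training point
    strictly precedes every test point -- no look-ahead leakage."""
    fold_size = (n_samples - min_train) // n_folds
    for i in range(n_folds):
        train_end = min_train + i * fold_size
        train = list(range(train_end))
        test = list(range(train_end, train_end + fold_size))
        yield train, test

splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=20))
```

Each successive fold trains on a longer history and tests on the period that immediately follows it, mimicking how a model would actually be deployed.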
11:30 - 12:15
Multispectral image inpainting through GANs: creating new images to restore corrupted ones (EN)
María José Meneses
Multispectral images contain a large amount of useful information that can be processed and analyzed for different applications; however, processing this type of image can discard valuable information. Here, an inpainting method based on Generative Adversarial Networks is proposed. It is an unsupervised learning method, able to generate visually acceptable results by processing multispectral images with corruptions of random size and location.
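The "corruptions of random size and location" mentioned above can be simulated with a simple masking step when building training pairs for an inpainting model. A hedged sketch; the function and its parameters are assumptions for illustration, not the authors' code:

```python
import numpy as np

def corrupt_randomly(img, max_frac=0.5, seed=None):
    """Zero out a rectangle of random size and position; return the
    corrupted image and the boolean mask of surviving pixels."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    rect_h = int(rng.integers(1, max(2, int(h * max_frac))))
    rect_w = int(rng.integers(1, max(2, int(w * max_frac))))
    top = int(rng.integers(0, h - rect_h + 1))
    left = int(rng.integers(0, w - rect_w + 1))
    mask = np.ones((h, w), dtype=bool)
    mask[top:top + rect_h, left:left + rect_w] = False
    return img * mask, mask

image = np.ones((16, 16))                  # stand-in for one spectral band
corrupted, mask = corrupt_randomly(image, seed=0)
```

During training the generator receives the corrupted image (and optionally the mask) and is asked to reconstruct the original; the discriminator judges whether the filled-in region looks plausible.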
12:15 - 13:00
Challenges faced with machine learning in practice (EN)
Jan van der Vegt
13:00 - 13:45
Workflows for an Enterprise Data Science Experience on Kubeflow (EN)
Yannis Zarkadas, Stefano Fioravanzo
Lesson 1: GitOps and Declarative Infrastructure
Revisit the declarative nature of Kubernetes and apply GitOps best practices to get immutable, trackable, and reproducible infrastructure. Deploy and manage Kubeflow using the GitOps methodology.

Lesson 2: Reproducible Pipelines with Kale
Follow the steps of a data scientist deploying their pipelines in a secure and isolated manner. Try out an end-to-end user workflow right from your Jupyter Notebook by leveraging Kale, the easiest way to go from notebook to pipeline.
13:45 - 15:15
Lunch
15:15 - 16:45
DataRobot DRUM: a machine learning framework to deploy your models (Workshop) (EN)
Thodoris Petropoulos
- Introduction to DRUM
- Some slides to showcase
- Exercises on DRUM
16:45 - 17:30
Natural language understanding at scale with Spark NLP (EN)
David Talby
NLP is a key component in many data science systems that must understand or reason about text. This session introduces the Spark NLP library – the world's most widely used NLP library in the enterprise, which provides state-of-the-art accuracy, speed, and scalability for language understanding. We'll cover common NLP tasks: named entity recognition, sentiment analysis, spell checking and correction, document classification, and multilingual support. The discussion includes the latest advances in deep learning and transfer learning used to tackle these tasks, including the prebuilt use of BERT embeddings within Spark NLP and 'post-BERT' research results like XLNet, ALBERT, and BioBERT. Spark NLP builds on the Apache Spark and TensorFlow ecosystems, and as such it's the only open-source NLP library that can natively scale to use any Spark cluster and take advantage of modern hardware platforms.
17:30 - 18:15
Deep learning on mobile (EN)
Siddha Ganju
Over the last few years, CNNs have risen in popularity, especially in the area of computer vision. Many mobile applications running on smartphones and wearable devices would potentially benefit from the new opportunities enabled by deep learning techniques. However, CNNs are by nature computationally and memory intensive, making them challenging to deploy on a mobile device. Siddha explains how to practically bring the power of convolutional neural networks and deep learning to memory- and power-constrained devices like smartphones. You'll learn various strategies to circumvent obstacles and build mobile-friendly shallow CNN architectures that significantly reduce the memory footprint, making them easier to store on a smartphone. She also dives into how to use a family of model compression techniques to prune the network size for live image processing, enabling you to build a CNN version optimized for inference on mobile devices. Along the way, you'll learn practical strategies to preprocess your data in a manner that makes the models more efficient in the real world.
18:15 - 19:00
30 Golden Rules of Deep Learning Performance (EN)
Anirudh Koul
"Watching paint dry is faster than training my deep learning model."
"If only I had ten more GPUs, I could train my model in time."
"I want to run my model on a cheap smartphone, but it's probably too heavy and slow."

If this sounds like you, then you might like this talk.

Exploring the landscape of training and inference, we cover a myriad of tricks that step-by-step improve the efficiency of most deep learning pipelines, reduce wasted hardware cycles, and make them cost-effective. We identify and fix inefficiencies across different parts of the pipeline, including data preparation, reading and augmentation, training, and inference.

With a data-driven approach and easy-to-replicate TensorFlow examples, fine-tune the knobs of your deep learning pipeline to get the best out of your hardware. And with the money you save, demand a raise!

19:00 - 19:15
Conference Track Closing