Transcriptions

Note: this content has been automatically generated.
00:00:02
So, yeah, everybody's talking about it, obviously,
00:00:07
and we are collecting more and more of it. A study by IDC forecasts that by 2020 we will have
00:00:12
something like 35 zettabytes of machine-generated data. But the essence of collecting data is getting insight out of it,
00:00:19
and that's why we're introducing new applications. These applications are of an exploratory nature, which means
00:00:24
that they are dynamic, so we don't know a priori what kind of
00:00:27
workload is going to be run, and also that the queries
00:00:30
depend on the data as well as on the results of prior queries.
00:00:35
Examples of such applications are scientific exploration applications, such as the Human Brain
00:00:40
Project or astronomical observation experiments, as well as modern Internet of Things applications,
00:00:44
where the user does not know exactly what he is searching for; he is looking for interesting patterns.
00:00:49
And these, the increasing data collections as well as the exploratory nature of modern applications,
00:00:56
create new challenges for data processing systems. Specifically, the user
00:00:59
wants instant access to data, so minimal preprocessing time.
00:01:02
Also, the productivity of the user relies on
00:01:06
interactive query response times. And finally, increasing data size
00:01:11
also increases the requirements for storage and computation,
00:01:15
so coming up with cost-efficient storage and computation solutions is another challenge
00:01:20
we face. But let us see why these are actually problems.
00:01:23
So, conventionally, in order to explore data, scientists use databases.
00:01:29
However, in order to start querying, you first have to load and index
00:01:32
the data. Indices are essentially redundant
00:01:36
data structures which make known data access paths faster.
00:01:41
However, the trouble is that actually having to
00:01:45
load and index the data, this preparation step, is very time consuming.
00:01:49
On the right-hand side you can see the cumulative execution
00:01:52
time for executing an Internet of Things workload,
00:01:57
and what we see is that, since the user actually has to load and index first, the
00:02:02
grey line, which shows query time, starts out high, since it contains the preprocessing time.
00:02:09
In my research, what I do is actually try to
00:02:15
enable the existing
00:02:19
query engines with interactive exploration capabilities
00:02:24
by taking advantage of the underlying data distributions and adapting to the workload while running queries.
00:02:30
To achieve that, I develop online tuning algorithms which build
00:02:36
overlay data structures as a byproduct of query execution, and also reduce the resource requirements.
00:02:44
By doing that, we actually remove the
00:02:46
requirement for preprocessing, we reduce the storage overhead,
00:02:50
and we enable the user to explore data efficiently while
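
Editor's illustration (not part of the talk): the idea of building an index as a byproduct of query execution can be sketched with cracking-style adaptive indexing, a well-known technique from the literature. The talk does not say which algorithm the speaker uses, so this is only an assumed stand-in. The class `CrackedColumn` and its methods are hypothetical names, and the sketch assumes a small in-memory column of comparable values.

```python
import bisect


class CrackedColumn:
    """Column copy that range queries incrementally partition ("crack")."""

    def __init__(self, values):
        self.data = list(values)
        # Sorted pivot values seen so far. For pivots[i], positions[i] is the
        # first index in self.data that holds a value >= pivots[i].
        self.pivots = []
        self.positions = []

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return its split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked here, no work needed
        # Only the piece between the neighbouring pivots has to be touched.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high (in no particular order)."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


col = CrackedColumn([5, 1, 9, 3, 7, 2, 8])
first = col.range_query(3, 8)   # the first query partitions the whole column
second = col.range_query(1, 3)  # later queries only touch smaller pieces
```

No load-and-index step runs before the first query. Each query pays a little extra to reorganise only the piece of the column it touches, so the column becomes more ordered in the value ranges that are actually queried.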

Conference program

Welcome address
Andreas Mortensen, Vice President for Research, EPFL
7 June 2018 · 9:49 a.m.
Introduction
Jim Larus, Dean of IC School, EPFL
7 June 2018 · 10 a.m.
The Young Software Engineer’s Guide to Using Formal Methods
K. Rustan M. Leino, Amazon
7 June 2018 · 10:16 a.m.
Safely Disrupting Computer Networks with Software
Katerina Argyraki, EPFL
7 June 2018 · 11:25 a.m.
Short IC Research Presentation 2: Gamified Rehabilitation with Tangible Robots
Arzu Guneysu Ozgur, EPFL (CHILI)
7 June 2018 · 12:15 p.m.
Short IC Research Presentation 3: kickoff.ai
Lucas Maystre, Victor Kristof, EPFL (LCA)
7 June 2018 · 12:19 p.m.
Short IC Research Presentation 5: CleanM
Stella Giannakopoulo, EPFL (DIAS)
7 June 2018 · 12:25 p.m.
Short IC Research Presentation 6: Understanding Cities through Data
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 12:27 p.m.
Short IC Research Presentation 7: Datagrowth and application trends
Matthias Olma, EPFL (DIAS)
7 June 2018 · 12:31 p.m.
Short IC Research Presentation 8: Point Cloud, a new source of knowledge
Mirjana Pavlovic, EPFL (DIAS)
7 June 2018 · 12:34 p.m.
Short IC Research Presentation 9: To Click or not to Click?
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 12:37 p.m.
Short IC Research Presentation 10: RaaSS Reliability as a Software Service
Maaz Mohiuddlin, LCA2, IC-EPFL
7 June 2018 · 12:40 p.m.
Short IC Research Presentation 11: Adversarial Machine Learning in Byzantium
El Mahdi El Mhamdi, EPFL (LPD)
7 June 2018 · 12:43 p.m.
20s pitch 1: Cost and Energy Efficient Data Management
Utku Sirin, (DIAS)
7 June 2018 · 2:20 p.m.
20s pitch 2: Gamification of Rehabilitation
Arzu Guneysu Ozgur, EPFL (CHILI)
7 June 2018 · 2:21 p.m.
20s pitch 4: Neural Network Guided Expression Transformation
Romain Edelmann, EPFL (LARA)
7 June 2018 · 2:21 p.m.
20s pitch 5: Unified, High Performance Data Cleaning
Stella Giannakopoulo, EPFL (DIAS)
7 June 2018 · 2:21 p.m.
20s pitch 6: Interactive Exploration of Urban Data with GPUs
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 2:22 p.m.
20s pitch 7: Interactive Data Exploration
Matthias Olma, EPFL (DIAS)
7 June 2018 · 2:22 p.m.
20s pitch 8: Efficient Point Cloud Processing
Mirjana Pavlovic, EPFL (DIAS)
7 June 2018 · 2:23 p.m.
20s pitch 9: To Click or not to Click?
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 2:24 p.m.
20s pitch 10: RaaSS Reliability as a Software Service
Maaz Mohiuddlin, LCA2, IC-EPFL
7 June 2018 · 2:24 p.m.
20s pitch 11: Adversarial Machine Learning in Byzantium
El Mahdi El Mhamdi, EPFL (LPD)
7 June 2018 · 2:24 p.m.
Machine Learning: Alchemy for the Modern Computer Scientist
Erik Meijer, Facebook
7 June 2018 · 2:29 p.m.