
Transcriptions

Note: this content has been automatically generated.
00:00:00
Hello everyone.
00:00:02
My name is Julian Eisenschlos, and I am going to talk about claim verification from visual language
00:00:08
on the web. The first thing you might be asking yourself is: what is visual language?
00:00:14
Visual language is a form of communication
00:00:19
that doesn't rely on just text, and it's actually everywhere around us:
00:00:24
banners and posters, charts and plots, infographics, even signs on the street.
00:00:30
This form of communication is very native to all of us, it sometimes transcends languages,
00:00:36
and it is very useful for organising information. The focus of my team at Google
00:00:43
for the last couple of years has been
00:00:46
on making AI interact well with these types of
00:00:51
communication, and the application I will talk about today is how to
00:00:56
leverage this for claim verification.
00:01:02
So how does generative AI relate to all of this? In fact, the relationship between
00:01:07
the two things is kind of complicated: generative models have been
00:01:12
trained not to be faithful to, let's say, information we
00:01:18
might consider accurate, but instead to produce content that seems plausible.
00:01:24
So when we want models to be accurate and faithful
00:01:30
to sources, that's when it gets tricky. The approaches that have been used,
00:01:35
and Andreas already referenced them in his talk
00:01:40
as mentioned before, so thanks very much for that, the approaches that the
00:01:46
community, and the people building products, are taking are around retrieving information from the
00:01:52
web and then trying to make the generative model attribute its output to it, and
00:01:56
this is becoming a go-to solution because it can enhance the accuracy of models.
00:02:03
But the source of this information, the internet at large, is, and I'll let you fill in the blank here, a
00:02:10
complicated place, because there are lots of sources of information
00:02:13
that users contribute to directly, which is a
00:02:17
good thing, but it also makes this a very tricky problem.
00:02:21
It's also very unstructured, so we might find the evidence
00:02:26
to verify some claim in any format, and then, how do I know these things are
00:02:32
telling me the same thing? There are some simple cases, like just the surface form
00:02:37
of the name of a person, but there are much more complicated things, like
00:02:41
numbers adjusted by inflation, or
00:02:46
what currency or unit they are using, and so on.
00:02:53
We can think about the information on the internet in two large groups. On the one hand we have all the
00:03:01
large-scale stuff, for example textual information,
00:03:07
which is a whole broad category, and more recently also
00:03:13
videos and images; these are everywhere, at large scale, and
00:03:18
they can be leveraged to build models. On the other side we have what I would call knowledge-rich information:
00:03:24
it's curated, smaller, and maybe doesn't cover everything.
00:03:28
For example knowledge graphs, that is, collections of entities
00:03:33
and the relationships among them; then databases, collections
00:03:37
of tables with very complete schemas and typed columns;
00:03:42
also maybe dictionaries with definitions of concepts. In the intersection of
00:03:47
these two things we have a lot of visual language, for example tables and charts.
00:03:53
They are not fully structured, but they still
00:03:58
try to organise information in a way that can be communicated
00:04:01
effectively, and different data points can be compared;
00:04:06
properties of entities are aligned once we
00:04:11
put them in the form of a table or chart. So we focus on
00:04:17
this intersection because, by its large-scale nature, it is going to cover a lot of different
00:04:23
topics, domains, and languages, but because it is knowledge-rich
00:04:29
it's also going to organise and compare information more effectively.
00:04:33
So we think taking advantage of this is a good opportunity for
00:04:39
verifying claims: finding such information on the web to
00:04:44
be used to improve generative models. Just as
00:04:49
a data point on this intersection: in Common Crawl, some studies have estimated that
00:04:55
there are more than half a billion relational
00:04:58
tables. Those are not just the organisational tables of
00:05:02
the layout of a website, but tables actually organising information:
00:05:06
for example, each row represents an entity and
00:05:09
each column some attribute. So this is great; we can try to leverage this and take advantage of this opportunity.
00:05:17
The data can be from any kind of domain, for example public health
00:05:21
or science, of course lots of sports results if that's your cup of tea,
00:05:27
financial information, stocks, or reports of companies,
00:05:33
and also a lot of commercial stuff, comparisons of products; any kind of domain you
00:05:39
can think of. As for the structure of this short
00:05:44
talk, we're going to start by talking about the work that
00:05:48
our group has been doing on how to find the relevant
00:05:51
tables to verify claims or answer these queries and questions;
00:05:55
then, once we have found the relevant tables, how we use them to verify
00:06:00
the information, so how to read the tabular content; then moving on from just tables
00:06:06
to all types of visual language, as the title of the talk promises;
00:06:11
and then closing with some open challenges. So, how do we find these
00:06:18
golden tables that we want to use to verify
00:06:21
potential claims? Let's say we have a claim
00:06:26
we want to verify, that some country came in second
00:06:28
in medal count in the 1999 South Asian Games.
00:06:33
There happens to be one nice table out there, in this case from Wikipedia,
00:06:40
that has the medals for this competition, the South Asian Games in 1999,
00:06:46
and if we check the
00:06:52
gold medal column we can see that it is true that the country came in second.
00:06:59
So we would like to find this table in order to verify this claim, and to do that we need to
00:07:06
first train some system that can associate the whole index of
00:07:13
tables that we have on the web with the corresponding claims.
00:07:16
The way we do this is by embedding
00:07:20
everything in some high-dimensional space, as a vector,
00:07:24
and then we want matching pairs to be close to each other: we want the vector for this table
00:07:28
and the vector for this claim to be very close, closer than any other random table or any other random
00:07:34
claim.
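To make the vector-space idea concrete, here is a minimal sketch of dense retrieval over tables, assuming some trained text encoder is available; the helper names and the table serialisation are illustrative placeholders, not the actual system from the talk:

    import numpy as np

    def linearize(table):
        """Flatten a table (a list of row dicts) into one string; this
        simplistic scheme stands in for a real serialisation."""
        header = " | ".join(table[0].keys())
        rows = " ; ".join(" | ".join(str(v) for v in row.values()) for row in table)
        return header + " ; " + rows

    def retrieve(claim, tables, encode):
        """Rank tables by cosine similarity to the claim.

        `encode` is assumed to map a string to a unit-norm numpy vector,
        e.g. a trained dual encoder; here it is a black box."""
        claim_vec = encode(claim)
        scored = [(float(np.dot(claim_vec, encode(linearize(t)))), t) for t in tables]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)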
00:07:40
This is, in principle, hard to do at scale, because there's not a lot of data about correspondences between
00:07:44
claims and tables. So this is part of the
00:07:48
recurring point of this talk: there are multiple axes to
00:07:51
making such a system work; one is the models, and the other is the data used to train them. So this is on the
00:07:57
data side. The way we approached it a couple of years ago was: let's try to
00:08:06
kick-start such a system by having some
00:08:12
large dataset of, let's say, claims and
00:08:15
tables, and we do that with a bit of a trick: we take a high-quality
00:08:22
corpus of tables, for example the tables on Wikipedia, and we look at all the
00:08:27
sentences that appear close to a table, reference it, or maybe link to it. So we will kind of
00:08:34
fake it, you know: let's take any
00:08:37
one of those sentences and the nearby tables,
00:08:42
and we will kind of pretend this is the claim and train the system
00:08:48
so that, in this high-dimensional space, the vector for this claim, the
00:08:54
sentence, and the vector for this table end up nearby.
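A minimal sketch of this pair-mining trick, assuming pages have already been parsed so that each table carries its adjacent sentences; the page structure and the word-count filter are hypothetical:

    def mine_pairs(pages):
        """Yield (pseudo_claim, table) training pairs from parsed wiki pages.

        Each page is assumed to be a dict holding a list of tables, and for
        every table the sentences found adjacent to it in the article body.
        """
        for page in pages:
            for table in page["tables"]:
                for sentence in table["adjacent_sentences"]:
                    # Light quality filtering: skip very short fragments.
                    if len(sentence.split()) < 5:
                        continue
                    yield sentence, table["content"]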
00:08:59
By doing this we collected a corpus of six million or so tables,
00:09:05
which we can then extend to more languages and more websites beyond Wikipedia, but
00:09:10
as a proof of concept it makes the point. And
00:09:13
with it we can train a system to help us retrieve and find the
00:09:17
corresponding tables. So that's it from the data point of view. From the modelling point of view, what we
00:09:25
used is something called contrastive learning: you
00:09:29
train by having a positive example and also negative examples to show
00:09:34
the contrast. I don't want to get into the math
00:09:39
here, but the idea is that we have multiple pairs of
00:09:44
associated tables and queries or sentences,
00:09:49
and we want the corresponding pairs to be close in this vector space and the other pairs to be far away.
00:09:56
So we can lay this out as a matrix,
00:10:00
where we want the first query or claim to be close
00:10:04
to the first table, the second one close to the second table, et cetera; those are the pairs that we have.
00:10:10
And then, in contrast, we want the things off the diagonal to be far away from each
00:10:15
other, because if we only had a positive signal
00:10:22
and we wanted these points in high dimensions to be close to each other, everything would
00:10:26
collapse into nothingness. So we need the contrast to say which things should be far away
00:10:32
and which things should be close to each other. So that's the setup
00:10:38
we use to train the system.
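A minimal sketch of that in-batch contrastive objective in PyTorch; the temperature value and the normalisation choices are illustrative, not the exact training recipe:

    import torch
    import torch.nn.functional as F

    def in_batch_contrastive_loss(claim_vecs, table_vecs, temperature=0.05):
        """In-batch softmax contrastive loss over a similarity matrix.

        claim_vecs, table_vecs: (batch, dim) tensors where row i of each
        forms a positive pair; every off-diagonal pairing is a negative.
        """
        claim_vecs = F.normalize(claim_vecs, dim=-1)
        table_vecs = F.normalize(table_vecs, dim=-1)
        # (batch, batch) cosine similarities; the diagonal holds positives.
        sims = claim_vecs @ table_vecs.T / temperature
        targets = torch.arange(sims.size(0), device=sims.device)
        # Cross-entropy pulls the diagonal up and pushes the rest down.
        return F.cross_entropy(sims, targets)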
00:10:44
The other aspect, more of a modelling insight, is that up until then,
00:10:49
and even today, the way these large models, transformers, for example the ones
00:10:55
behind GPT or Gemini
00:10:59
et cetera, encode text is as two
00:11:05
attributes: on the one hand they have the token representations, so what
00:11:11
word is this, or what part of a word, to be accurate,
00:11:15
and then what position does this word
00:11:18
have in the sequence. We added another type of
00:11:25
representation, embeddings that say where in the two-
00:11:29
dimensional layout of your website or your table this piece of information lies.
00:11:34
This allows us to not just look at the sequence of text, but to have a richer representation that takes account of
00:11:40
the visual aspect of this semi-structured data.
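As a sketch, this extra layout signal can be added the same way the usual position embeddings are; the embedding sizes and the row/column scheme below are illustrative, loosely in the spirit of what is described, not the actual model configuration:

    import torch
    import torch.nn as nn

    class TableTokenEmbedding(nn.Module):
        """Token embeddings enriched with 2D table coordinates.

        On top of the usual token and sequence-position embeddings, each
        token also gets row and column embeddings describing where in the
        table's two-dimensional layout it sits (0 can mean "not in a cell").
        """

        def __init__(self, vocab=30522, dim=768, max_pos=512, max_rows=256, max_cols=256):
            super().__init__()
            self.tok = nn.Embedding(vocab, dim)
            self.pos = nn.Embedding(max_pos, dim)
            self.row = nn.Embedding(max_rows, dim)
            self.col = nn.Embedding(max_cols, dim)

        def forward(self, token_ids, row_ids, col_ids):
            positions = torch.arange(token_ids.size(1), device=token_ids.device)
            return (self.tok(token_ids) + self.pos(positions)
                    + self.row(row_ids) + self.col(col_ids))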
00:11:45
Putting it all together, and I don't want to dwell too much on results:
00:11:49
we were able to build a system that could find the source
00:11:55
tables much more accurately. The blue line is a strong
00:12:02
baseline system that only looks at the text, then we have another
00:12:07
system that relies on a machine
00:12:10
learning model, and then the last two systems are the ones
00:12:15
we built. The first one, in orange and red,
00:12:19
is the model trained on this big table corpus, DTR,
00:12:23
which is the name that we had for it, and then we also
00:12:28
got further improvements by adding some extra
00:12:31
tricks: for example, instead of just using random elements
00:12:35
as negatives, as mentioned before, you can get better results by looking at
00:12:40
hard negative examples, so you can teach the
00:12:45
model with examples of tables that look relevant but don't actually match
00:12:49
your claim.
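A sketch of what such hard-negative mining can look like, assuming the current encoder produces unit-norm vectors for the claim and for every candidate table; all names here are placeholders:

    import numpy as np

    def mine_hard_negatives(claim_vec, table_vecs, positive_idx, k=5):
        """Indices of the k most claim-similar tables, excluding the true one."""
        scores = table_vecs @ claim_vec          # cosine scores if unit-norm
        scores[positive_idx] = -np.inf           # never pick the positive
        return np.argsort(-scores)[:k]           # top-k most confusable tables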
00:12:57
OK, so now that we have found the table we need to verify the claim, how do we check
00:13:04
whether the information is supported or contradicted by the
00:13:09
content of the table? Another thing that Andreas very
00:13:15
rightly mentioned is that we need benchmarks in order to be able to measure progress on these things.
00:13:19
The first benchmark here is one called TabFact;
00:13:26
the idea was to collect a set of tables and associated
00:13:31
claims that could be either entailed, that is verified, or refuted,
00:13:35
by the tables. For example, an entailed claim would
00:13:38
be that Greg Norman and Steve Elkington are from the same country,
00:13:42
and a refuted one could be that
Greg Norman and Billy Mayfair tie in rank.
00:13:47
What this enables, in essence, is evaluating how well we can verify
00:13:53
these claims against these tables. So we did some early work,
00:13:58
around 2020, on this dataset, by also
00:14:01
trying to leverage the sentences adjacent to tables, so we could
00:14:07
build, well, the idea is that we can automatically build a larger
00:14:11
dataset of such claims. We take positives to be
00:14:15
any sentence, with some quality filtering, that is adjacent to a table,
00:14:21
as long as they have some entity
00:14:23
overlap; so if Greg Norman, for example, appears mentioned,
00:14:28
we take it to be a positive example. We also build negative examples by corrupting
00:14:32
the sentences: we swap Greg Norman for another entity that appears
00:14:38
in the same column, so it's the same type,
00:14:42
and that produces plausible negatives. We can do this at large scale and train systems
00:14:47
to do this, but I don't want to stay too long on this work; it's kind of more old school.
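A toy sketch of that corruption step, assuming we know which table column the entity mention came from; the helper and the example data are made up:

    import random

    def corrupt_claim(claim, entity, column_values, rng=random.Random(0)):
        """Swap `entity` in `claim` for a different value from its column,
        so the negative stays type-consistent and plausible."""
        candidates = [v for v in column_values if v != entity and v not in claim]
        if not candidates or entity not in claim:
            return None  # nothing safe to swap
        return claim.replace(entity, rng.choice(candidates))

    # Example: a positive sentence adjacent to a golf results table.
    players = ["Greg Norman", "Steve Elkington", "Billy Mayfair"]
    positive = "Greg Norman and Steve Elkington are from the same country."
    negative = corrupt_claim(positive, "Steve Elkington", players)
    # -> e.g. "Greg Norman and Billy Mayfair are from the same country."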
00:14:54
We also did a little work more recently
00:14:58
on directly understanding, and being able to verify facts,
00:15:02
leveraging larger models without any training data. So this is an example of,
00:15:10
not quite fact checking, but question answering, which are very related tasks. We have a table from
00:15:16
Wikipedia and we have this question: which country had the most cyclists finish in the top ten?
00:15:21
The approach that people have been using is around the idea
00:15:26
of chain of thought: asking a large model like ChatGPT, Gemini,
00:15:31
et cetera, to reason step by step and try to come up with an answer
00:15:35
to this. And this is a complicated task; it requires you to
00:15:41
do a bunch of steps, for example finding first who are the top ten,
00:15:47
then getting the countries, then counting by country and getting the majority.
00:15:56
So the model gets it wrong: it kind of rushes into an answer
00:16:02
instead of doing all the steps one by one. Another common approach
00:16:07
that works well: we have a table, so let's treat it as a kind of database and apply
00:16:12
some programming on top of it; you could imagine running some Python code or some SQL code.
00:16:19
The challenge is that this database is structured,
00:16:22
but not super structured; it's sometimes semi-structured, so
00:16:26
you don't have all the information to run this as SQL, and sometimes it's quite tricky to write
00:16:31
things that are very easy to express in language as a simple Python or SQL query;
00:16:37
for example, the fact that we don't really have type checking
00:16:41
makes it very complicated to run these queries.
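For instance, if the table were already a clean, typed database, the cyclist question above might reduce to a short query like this (a hypothetical table layout; real web tables rarely cooperate this neatly):

    import pandas as pd

    # Hypothetical, already-clean version of the race results table.
    results = pd.DataFrame({
        "rank": range(1, 11),
        "cyclist": [f"rider_{i}" for i in range(1, 11)],
        "country": ["ESP", "ITA", "ESP", "FRA", "ITA",
                    "ESP", "GER", "ITA", "ESP", "FRA"],
    })

    # Which country had the most cyclists finish in the top ten?
    top_country = results.head(10)["country"].value_counts().idxmax()
    print(top_country)  # -> "ESP" in this made-up data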
00:16:46
What we proposed in this work, to be presented
00:16:49
in a few months, well, in a month, at ICLR, is to
00:16:54
try to extend this notion of step-by-step reasoning to more semi-structured data
00:16:59
by building intermediate steps which are in themselves tables, so we call it chain-of-table.
00:17:06
We'll have a model evolve the input data
00:17:09
into something that has all the evidence it needs to
00:17:13
reliably answer the question. Okay, oops, I'm running out of
00:17:19
time. As you can see, each of the different steps
00:17:25
encodes part of the reasoning that we do to solve the task; for example, the first step
00:17:31
will create a new column that has
00:17:37
the countries, and then we can do the aggregation,
00:17:41
counting by country, which gets us to the right result.
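A minimal sketch of the chain-of-table idea: each reasoning step is a tabular operation whose output is itself a table. The operation set, the toy data, and the fixed chain below are simplified stand-ins, not the method's actual planner:

    import pandas as pd

    def add_country_column(df):
        # Step 1: materialise a latent attribute, e.g. parse the country
        # out of a "cyclist (COUNTRY)" cell into its own column.
        df = df.copy()
        df["country"] = df["cyclist"].str.extract(r"\((\w+)\)", expand=False)
        return df

    def group_by_country(df):
        # Step 2: aggregate the evidence needed for the question.
        return df.groupby("country", as_index=False).size()

    def answer(df):
        # Step 3: read the answer off the final, small table.
        return df.sort_values("size", ascending=False).iloc[0]["country"]

    chain = [add_country_column, group_by_country]
    table = pd.DataFrame({"rank": [1, 2, 3],
                          "cyclist": ["A (ESP)", "B (ITA)", "C (ESP)"]})
    for step in chain:          # each intermediate result is itself a table
        table = step(table)
    print(answer(table))        # -> "ESP"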
00:17:49
So, moving on from tables to more complex visuals, we also looked at
00:17:55
a very cool benchmark, much more recent, for understanding visuals on the web,
00:18:00
so for example charts; here we have a question about whether 25 is the sum of
00:18:06
the last three places. Here we leverage
00:18:14
large-scale data
00:18:18
and visualisations to build models that
00:18:21
can understand charts much better. Typically, research in computer vision has been focused more on
00:18:27
natural images of cats and dogs and the like, and not so much on this type of visuals, which we think are very important.
00:18:34
So we built these foundation models by leveraging large-scale data from the web, as I said before.
00:18:39
This model we call MatCha; it excels in a bunch of academic benchmarks.
00:18:45
I can skip this. And our latest
00:18:51
work, which is under review now and
00:18:55
we released a few days ago, is how to use these models to be
00:19:00
able to verify claims based on visual information. So we have, for example,
00:19:07
some claim here that says something like 19.8 percent of
00:19:10
companies saw increasing fixed costs, and so on.
00:19:14
So we built this pipeline based on the models that we mentioned before, so table understanding
00:19:20
and chart understanding. We first have a chart, and a model extracts a table from
00:19:27
the chart; then we can have a table understanding, table
00:19:35
entailment model look at the claim. We actually break the claim down
00:19:42
into sentences and try to fact-check each of them,
00:19:46
and then we produce one binary decision, and
00:19:50
we can use this to either rate a specific
00:19:55
summary or a specific claim regarding a chart, and also to improve
00:20:00
existing systems; we have a lot of human analysis showing
00:20:05
that we can use this to quickly produce much more faithful
00:20:09
summaries of charts.
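Schematically, the pipeline can be seen as two assumed model calls chained together; the function names here are placeholders, not a real API:

    def verify_chart_claim(chart_image, claim, derender_chart, table_entails,
                           split_sentences):
        """Return True only if every sentence of the claim is entailed.

        1. De-render the chart into an intermediate table.
        2. Break the claim (or summary) into individual sentences.
        3. Run a binary entailment check per sentence against the table.
        """
        table = derender_chart(chart_image)
        sentences = split_sentences(claim)
        return all(table_entails(table, sentence) for sentence in sentences)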
00:20:16
So I'm going to close with some open challenges. One would be dealing with more complex documents, so
00:20:19
not just one visualisation but multiple visualisations, multiple graphics,
00:20:23
or maybe interactive visualisations; you get the idea. Another one,
00:20:28
and this is very interesting, we were discussing it during the coffee break,
00:20:33
is dealing with conflicting sources, which may require presenting
00:20:36
multiple points of view. And finally, validating more complex
00:20:41
statements, especially those making causal claims. So the takeaways:
00:20:46
visual language is available at massive scale and we should take advantage of it;
00:20:52
with today's modelling we can integrate it into
00:20:56
smaller models that are cheaper, and also into larger models;
00:21:00
and it can efficiently ground generative AI to...


Conference Program

Opening and introduction
Prof. Lonneke van der Plas, Group Leader at Idiap, Computation, Cognition & Language
Feb. 21, 2024 · 9 a.m.
Democracy in the Time of AI: The Duty of the Media to Illuminate, Not Obscure
Sara Ibrahim, Online Editor & Journalist for the public service SWI swissinfo.ch, the international unit of the Swiss Broadcasting Corporation
Feb. 21, 2024 · 9:15 a.m.
AI in the federal administration and public trust: the role of the Competence Network for AI
Dr Kerstin Johansson Baker, Head of CNAI Unit, Swiss Federal Statistical Office
Feb. 21, 2024 · 9:30 a.m.
Automated Fact-checking: an NLP perspective
Prof. Andreas Vlachos, University of Cambridge
Feb. 21, 2024 · 9:45 a.m.
DemoSquare: Democratize democracy with AI
Dr. Victor Kristof, Co-founder & CEO of DemoSquare
Feb. 21, 2024 · 10 a.m.
Claim verification from visual language on the web
Julian Eisenschlos, AI Research @ Google DeepMind
Feb. 21, 2024 · 11:45 a.m.
Generative AI and Threats to Democracy: What Political Psychology Can Tell Us
Dr Ashley Thornton, Geneva Graduate Institute
Feb. 21, 2024 · noon
Morning panel
Feb. 21, 2024 · 12:15 p.m.
AI and democracy: a legal perspective
Philippe Gilliéron, Attorney-at-Law, Wilhelm Gilliéron avocats
Feb. 21, 2024 · 2:30 p.m.
Smartvote: the present and future of democracy-supporting tools
Dr. Daniel Schwarz, co-founder Smartvote and leader of Digital Democracy research group at IPST, Bern University of Applied Sciences (BFH)
Feb. 21, 2024 · 2:45 p.m.
Is Democracy ready for the Age of AI?
Dr. Georges Kotrotsios, Technology advisor, and former VP of CSEM
Feb. 21, 2024 · 3 p.m.
Fantastic hallucinations and how to find them
Dr Andreas Marfurt, Lucerne University of Applied Sciences and Arts (HSLU)
Feb. 21, 2024 · 3:15 p.m.
LOCO and DONALD: topic-matched corpora for studying misinformation language
Dr Alessandro Miani, University of Bristol
Feb. 21, 2024 · 3:30 p.m.
Afternoon panel
Feb. 21, 2024 · 3:45 p.m.