This talk was presented as part of JuliaCon 2021.
Abstract:
Learning from raw data input is one of the key components of many successful applications of machine learning methods. While machine learning problems are often formulated on data that naturally translate into a vector representation suitable for classifiers, there are data sources with a unifying hierarchical structure, such as JSON. This talk will describe Mill.jl and JsonGrinder.jl, which together offer a theoretically justified approach to solving machine learning problems with these data sources.
For more info on the Julia Programming Language, follow us on Twitter: https://twitter.com/JuliaLanguage and consider sponsoring us on GitHub: https://github.com/sponsors/JuliaLang
Contents
00:00 Introduction and Motivation
01:07 Possible solutions to the device ID problem
02:52 HMIL approach
06:32 Comparison to SOTA
07:20 Theoretical guarantees
08:29 Mill.jl: node types
09:11 Mill.jl: representing leaves
10:58 Mill.jl: representing dicts
11:59 Mill.jl: representing arrays
13:19 Mill.jl: working with the whole sample
14:26 Mill.jl: training a model with JsonGrinder.jl and Flux.jl
19:58 Summary
S/O to https://github.com/SimonMandlik for the video timestamps!
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/JuliaCommunity/YouTubeVideoTimestamps
Interested in improving the auto-generated captions? Get involved here: https://github.com/JuliaCommunity/YouTubeVideoSubtitles