xu3kev 5 years ago

This kind of practical research work is so intriguing! Is there a roadmap or a career path for beginners to join and work in this field?

  • fooker 5 years ago

    Step 1: Get a PhD.

    I'm not joking. While there are exceptional situations where you might get to work on interesting projects, by far the most reliable way to do it is through an academic degree.

    • one-more-minute 5 years ago

      Not at all. Only a small number of the people involved in this actually have PhDs (often in unrelated fields) and many of the most prolific contributors are undergrads who get involved via summer projects. If you're interested in internals, probably the best way to learn is to just join our slack (see julialang.org) and chat with all the developers.

      • ViralBShah 5 years ago

        Both are true. A PhD program is a great way to find a chunk of time to dedicate to a problem for a few years. At the same time, we've had contributors who were in high school when they started, through GSoC, and just generally interested folks in the community. On the other end, we've had tenured professors and even retired professors contributing to Julia.

        Eventually, it boils down to two things. First, find a problem that you want to work on and engage in the community, as one-more-minute points out. Second, find a way to support that: a PhD, an interested employer, a university, a grant, or perhaps you are just well off and can do whatever you want!

        • xiaodai 5 years ago

          I want a full-time Julia role so bad. Unfortunately I have a family to support and a mortgage to pay, so I can't quit my job unless I find something. But I have been contributing to Julia despite having a hectic banking career schedule. The fast `StatsBase.countmap(::UInt16)`, for example, was my first contribution!

          Don't mind a pay cut but I failed at my first attempt to land a (rare) Julia job in Sydney. Willing to relocate!

      • fooker 5 years ago

        Sure.

        How many of these prolific undergrad contributors end up working full time on interesting aspects of the project though?

ajtulloch 5 years ago

This is an incredibly impressive line of work. Huge props to the Julia team.

xiaodai 5 years ago

It's unfortunate but google docs are blocked at work. So why am I reading this at work? Cos it might be useful for my work.

I do wish more slides were hosted somewhere other than Google Docs, because some companies are worried about the use of Google Docs, especially where the data is sensitive, e.g. in banking and insurance.

  • throwawayjava 5 years ago

    Wow. Why?

    • victorNicollet 5 years ago

      Shadow IT. A client asks: if we cancel our subscription tomorrow and our competitor acquires your company next year, will our data have been deleted by then? And if you have employees sharing customer data with each other over unsanctioned channels, you can't really answer "yes".

      Blocking Google Docs is by far the cheapest way to prevent its use, even considering the downsides of a complete block.

      • jacoblambda 5 years ago

        I'd recommend suggesting that they whitelist /preview, as that is a view-only mode that allows you to access external docs without leaking data to the world.

      • hencoappel 5 years ago

        Does that mean they block every single website where you can upload data? Dropbox, Gmail, etc? What's even the point given how easy it would be to find some way to send stuff over the Internet?

        • PeterisP 5 years ago

          Yes. It's trivial to block everything you can think of easily, and for everything other than that there's network monitoring and legal recourse.

          Why would you bother to look for some awkward way to work with customers' sensitive data over the internet, if you're going to get fired for finding that way? The main goal is to prevent random people from storing data in unapproved locations "because it was easier that way". You can easily make it not be easier, which cuts 95% of the sharing, and then the only practical reasons to do so are clearly malicious.

        • xiaodai 5 years ago

          Yes. Some allow Gmail cos you can't upload large files anyway.

        • seventhtiger 5 years ago

          How easy is it? Except for email and our approved file sharing service, there is practically no way to share files outside of our network.

          Dropbox, drive, icloud, you can easily firewall all of that.

    • xiaodai 5 years ago

      Assuming it's not a troll question: banks and insurance companies store sensitive financial information about customers, so they don't want any of their employees to use Dropbox or the like.

seanmcdirmid 5 years ago

At first, I read this as "getting a general purpose compiler from machine learning"...but I guess we aren't there yet.

I do wonder, however, how fully differentiable programming languages will be supported by compilers in the near future.

pjmlp 5 years ago

Lots of great work being done with Julia, kudos to the team.

quasarj 5 years ago

wat

  • KenoFischer 5 years ago

    This is a talk we gave at the "Compilers for Machine Learning" workshop at CGO this past Sunday. Obviously the slides are missing some of the context of the presentation, but the core argument is that one shouldn't build compilers for machine learning, but rather build general purpose compilers that are flexible and extensible. From there, you can easily get all the same benefits as dedicated machine learning compilers without building a huge monolith. The rest of the slides highlight some of the work done in the Julia community to work in this direction.

    • ViralBShah 5 years ago

      Adding to the context, we wrote up a little post about the C4ML workshop: https://juliacomputing.com/blog/2019/02/19/growing-a-compile...

      Reproducing it here: The Compilers for Machine Learning workshop was recently held at CGO 2019. Since compiler techniques affect a large part of the machine learning stack, this workshop aimed to highlight research that incorporates compiler techniques and algorithms in optimizing machine learning workloads. The workshop included talks from various projects: Julia (Julia Computing), TVM (UW), Glow (Facebook), XLA (Google), nGraph (Intel), TensorRT (Nvidia), and the soon-to-be-released MLIR (Google).

      Our talk introduced the abstractions in the Julia language and the kind of compiler transforms involved in implementing them. We then had a deep dive into dynamic semantics + static analysis - our JAOT (Just-Ahead-Of-Time) analysis. Building on these capabilities, the Zygote system implements automatic differentiation, effectively treating it as a compiler problem, giving us differentiable programming for free. Finally, compiler backends for GPUs and TPUs give us high performance execution. All this comes together beautifully in Neural ODEs, which we had to show off as our first slide!

      • ViralBShah 5 years ago

        Also, since Yann LeCun said yesterday that Deep Learning needs a new language, this has become a major topic of interest.

        https://venturebeat.com/2019/02/18/facebooks-chief-ai-scient...

        Of course, our view is that Julia is one such language that people should consider seriously. The talk linked here is a peek under the hood and shows that differentiable programming in Julia is not a special add-on, but something that fits naturally within the language.

        • ced 5 years ago

          A long time ago, Yann LeCun wrote Lush, which was a numerically-focused Lisp dialect, with a focus on C interop. It might be one of Julia's closest siblings.

          > Lush is an object-oriented programming language designed for researchers, experimenters, and engineers interested in large-scale numerical and graphic applications. Lush is designed to be used in situations where one would want to combine the flexibility of a high-level, weakly-typed interpreted language, with the efficiency of a strongly-typed, natively-compiled language, and with the easy integration of code written in C, C++, or other languages.

          http://lush.sourceforge.net/index.html

          Is there more detail about his proposal? It must be well thought-out.

          • sitkack 5 years ago

            A Lisp with expressive macro support is at least what is needed. I know of nothing else that has the affordances to support internal DSLs and creative control flow.

            • yaantc 5 years ago

              Julia has hygienic macros. From their documentation [1]: "The strongest legacy of Lisp in the Julia language is its metaprogramming support. Like Lisp, Julia represents its own code as a data structure of the language itself."

              Parts of Julia are even implemented in Lisp [2], although they tend to be ported to Julia now IIUC. But it's clear the Julia developers are well aware of Lisp and its strong points.

              [1] https://docs.julialang.org/en/v1/manual/metaprogramming/ [2] https://discourse.julialang.org/t/the-role-of-femtolisp-in-j...

              • pjmlp 5 years ago

                Julia also feels a bit like Dylan, another Lisp offspring with Algol-like syntax.

              • sitkack 5 years ago

                That is super cool, I didn't know that.

    • jng 5 years ago

      Is there a video of the talk? Would love to watch it. Congrats on the work, it looks great! I'm just missing more detail, since having just the slides is a bit on the dry side.

    • SilasX 5 years ago

      How ... whatever was the reason for thinking that you needed a ML-specific compiler?

      Do people also think you need a timecard-tracking-specific compiler?

      • chrisseaton 5 years ago

        > How ... whatever was the reason for thinking that you needed a ML-specific compiler?

        As the slides say, you need the program to be differentiable. Do general purpose compilers make it easy to automatically differentiate a program? No, not until very recent work. So people thought a better idea until we figured out how to do that was ML-specific compilers and frameworks.

        > Do people also think you need a timecard-tracking-specific compiler?

        No, nobody thinks this, because timecard-tracking does not need any special properties such as being differentiable.

        The answer isn't as crazy as your extremely snarky question makes it out to be, is it?

      • KenoFischer 5 years ago

        Well, I do think there are some very real shortcomings that current general purpose compilers have when applied to machine learning (aggravated by the fact that lots of machine learning code is written in languages with poor compiler support), so lots of people looked at that, wrote small optimizers and got good performance. But then it turned out that researchers wanted to do more and more things in their ML models and these small optimizers are turning into full-blown compilers, with not a lot of thought about whether that is truly the correct thing to do.

      • mlevental 5 years ago

        do you know how modern ml works? do you know that every functional unit in a net needs to be differentiable and so needs to carry around either dual numbers (forward mode) or adjoints (reverse mode)? it's not as simple as just writing a math library.
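        For readers who have not met the two modes, here is a minimal Python sketch of what "carrying around" dual numbers or adjoints can look like. It is an illustration only: `Dual`, `Var`, `sin`, `backward` and `f` are hypothetical names made up for this example, and it uses plain operator overloading, i.e. the "math library" style of AD, not the compiler-level approach the slides describe for Julia.

```python
import math


class Dual:
    """Forward mode: every value carries its derivative alongside it."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)


class Var:
    """Reverse mode: every value records its parents and local gradients,
    and later accumulates an adjoint (d output / d this value)."""

    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents  # tuple of (parent Var, local gradient)
        self.adjoint = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.val + other.val, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.val * other.val,
                   ((self, other.val), (other, self.val)))


def sin(x):
    """sin that works on Dual, Var, or plain floats."""
    if isinstance(x, Dual):
        return Dual(math.sin(x.val), math.cos(x.val) * x.der)
    if isinstance(x, Var):
        return Var(math.sin(x.val), ((x, math.cos(x.val)),))
    return math.sin(x)


def backward(out):
    """Propagate adjoints from the output back to the inputs."""
    order, seen = [], set()

    def visit(v):  # depth-first topological sort of the recorded graph
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)

    visit(out)
    out.adjoint = 1.0
    for v in reversed(order):
        for parent, grad in v.parents:
            parent.adjoint += grad * v.adjoint


def f(x, y):
    # The same user code runs on floats, Duals, or Vars.
    return x * y + sin(x)


# Forward mode: seed dx = 1 to get df/dx in one pass.
dfdx_forward = f(Dual(2.0, 1.0), Dual(3.0)).der

# Reverse mode: one backward pass yields df/dx and df/dy together.
x, y = Var(2.0), Var(3.0)
backward(f(x, y))
```

        For f(x, y) = x*y + sin(x), both modes should agree that df/dx = y + cos(x), and the reverse pass also gives df/dy = x. The overloading approach needs every primitive (here only `+`, `*` and `sin`) to be re-implemented for the wrapper types, which is one reason the thread argues for language and compiler support instead.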

        • sitkack 5 years ago

          Please give the parent the benefit of the doubt. "Do you even" comments are alienating. It could have been written as:

          Every functional unit in a net needs to be differentiable and so needs to carry around either dual numbers (forward mode) or adjoints (reverse mode). These requirements necessitate new languages and compilers that current imperative representations don’t currently support.

        • ViralBShah 5 years ago

          There's an extensive discussion of AD in the slides (the topic of this HN post), and how it is done in Julia. Precisely because it is not as simple as writing a math library is why you need language and compiler support.

          • mlevental 5 years ago

            yes that's what i was pointing out to the poster i responded to

      • samcodes 5 years ago

        Good point, timecard-tracking software and general purpose differentiable computing languages are roughly the same level of complexity.