Translating 10M lines of Java to Kotlin

(engineering.fb.com)

101 points | by ermatt 3 days ago

14 comments

  • aduffy 6 hours ago
    I’m skeptical of the value in doing this. There are a mountain of tools like NullAway, ErrorProne, Immutables that make it so much easier to write safe code in Java. New developments in the language like first-class record types improve the DX as well.

    I think Kotlin helped push Java in the right direction, but at this point it seems like a weaker choice. Spending years to migrate a massive Java code base like this feels like wasted time.

    • t-writescode 6 hours ago
      I, personally, happen to like writing in Kotlin more than Java - with a lot of experience in both (though admittedly, all? the Java I wrote is in the pre-Streams style).

      I like:

        * data class Foo(val theImplementationAndStyleOfDataClasses: String)
        * elvis?.operator?.chaining
        * the order of variable declaration (type afterward)
        * how function pointers at the end of a method call can be auto-inserted as the next curly bracket
        * how you can have fun foo() = returnValue; in one line.
        * fun `method names that can have spaces - and dashes - in them`()
      
      The preceding 3, combined, allow for:

        @Test
        fun `much easier testing`() = commonTestSetupWrapper {
          // the code under test
        }
      
        * val immutable by default
      
      While I agree that Kotlin has definitely helped push Java in the right direction; and I agree that it's probably not especially necessary to migrate 10MM lines of Java code to Kotlin, especially since they're fully interoperable, I definitely would prefer writing _new_ code in Kotlin over Java for the for-the-lazy devex improvements, if nothing else.

      fwiw, my "good at it" programming history follows the lineage in historical order of:

        * Java (8?? years of it including competition programming)
        * C# (5+ years)
        * Python (2016ish to current)
        * Ruby (3-ish years, lots of Rails)
        * Kotlin (2-3 years, through current - written over 40k lines of Kotlin in the last year, alone)
      • 0cf8612b2e1e 5 hours ago

          `method names that can have spaces - and dashes - in them`
        
        Eugh, that is a turnoff. Maybe I have programmer Stockholm’s, but at least with a single connected word, I can always double click to select the token. Maybe, I might have wanted dashes at some point, but spaces seem like a step way too far.
        • ditn 4 hours ago
          Generally these are only used in test methods, which is fine. I've never seen anyone use them outside that.
          • recursive 2 hours ago
            What benefit do they provide in testing scenarios? I've never written Kotlin, but from an outsider's perspective, it seems like a slim benefit, outweighed by the cost of the mere existence of this syntactical oddity in the language's grammar.
            • t-writescode 2 hours ago
              When writing tests, you can name the methods more useful things, such as:

                class MyTestableClass {
                  fun `methodName - when input does not parse to valid regex throw exception`() {
              
                  }
                }
              
              It's pretty clear what is under test in a situation like that. That's basically the only situation I ever see it used in (and would code-smell the heck out of it if I saw it in other circumstances).

              People who are familiar with RSpec-style testing are very used to this sort of thing.

                describe MyTestableClass do
                  context 'input parsing issues' do
                    context 'when not valid regex' do
                      it 'throws exception' do
                        ...
                      end
                    end
                  end
                end
              
              Anecdotally, I've also found that such style naming for tests allows me to write out the desired names for all the tests ahead of time more easily and then implement them. That happens to be my flow.
              • codedokode 1 minute ago
                I like how tests are made in Python - you don't even need classes, just functions, and use a single assert keyword for everything. Also it's easy to parametrize them using decorators.
        • thaumasiotes 1 hour ago
          > Maybe I have programmer Stockholm’s, but at least with a single connected word, I can always double click to select the token.

          If you're using anything that can do syntax coloring correctly, you can still do that.

      • NIckGeek 4 hours ago
        fwiw in modern-Java they have data classes:

          record Foo(String theImplementationAndStyleOfDataClasses) {}
        
        The string is final. While sadly there is no elvis operator, the Optional type helps:

          elvis.map(e->e.operator).map(e->e.chaining)
        
        I still strongly dislike working in Java but Java 23 is a long way ahead of the Java 6 I first learnt Java with.
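
        A minimal sketch of that Optional chain, spelled out so it compiles. The Elvis and Operator records and the chainingOrDefault helper are hypothetical names made up for illustration; only Optional.ofNullable, map, and orElse come from the JDK.

        ```java
        import java.util.Optional;

        class ElvisSketch {
            // Hypothetical types standing in for elvis?.operator?.chaining.
            record Operator(String chaining) {}
            record Elvis(Operator operator) {}

            // Rough Java counterpart of Kotlin's `elvis?.operator?.chaining ?: "none"`.
            static String chainingOrDefault(Elvis elvis) {
                return Optional.ofNullable(elvis)
                        .map(Elvis::operator)     // becomes empty if operator is null
                        .map(Operator::chaining)  // becomes empty if chaining is null
                        .orElse("none");
            }

            public static void main(String[] args) {
                System.out.println(chainingOrDefault(new Elvis(new Operator("ok")))); // ok
                System.out.println(chainingOrDefault(new Elvis(null)));               // none
                System.out.println(chainingOrDefault(null));                          // none
            }
        }
        ```

        Each map step short-circuits to an empty Optional when it sees null, which is roughly what ?. does in Kotlin, just with more ceremony.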
      • ncallaway 3 hours ago
        I also really like conditionals like switches and ifs returning values.
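
        For comparison, Java covers part of this since Java 14: switch can be used as an expression, while if is still a statement, so the ternary operator has to stand in for it. A small sketch, with a hypothetical Lang enum and helper names:

        ```java
        class SwitchExpressionSketch {
            // Hypothetical enum, used only for this illustration.
            enum Lang { JAVA, KOTLIN, SCALA }

            // The switch expression yields a value directly; covering every enum
            // constant means no default branch is needed.
            static String fileExtension(Lang lang) {
                return switch (lang) {
                    case JAVA -> "java";
                    case KOTLIN -> "kt";
                    case SCALA -> "scala";
                };
            }

            // Java's `if` does not produce a value, so the ternary fills that role.
            static String describe(int count) {
                return count == 1 ? "1 file" : count + " files";
            }

            public static void main(String[] args) {
                System.out.println(fileExtension(Lang.KOTLIN)); // kt
                System.out.println(describe(3));                // 3 files
            }
        }
        ```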
      • akoboldfrying 3 hours ago
        Honestly, none of those differences you listed seems especially compelling to me, except possibly for the ?. operator.

        What would be compelling: Array types that don't hate generics, or generic collection types that don't hate primitive types. Does Kotlin improve on Java at all here? It's such a pain having to remember this bonus complexity when all I want is a consistent way to manage a bunch of things. (I suspect not, as I think the limitations are at the JVM level, but I don't know.)

        • t-writescode 3 hours ago
          Out of curiosity, what are you wanting that autoboxing doesn't resolve?
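
          For context, a sketch of what autoboxing does and doesn't smooth over in Java. The class and method names are hypothetical; the point is that List<int> is not legal, List<Integer> boxes every element, and new T[n] is rejected by the compiler, so a generic array needs a Class token.

          ```java
          import java.lang.reflect.Array;
          import java.util.List;

          class BoxingSketch {
              // The element type has to be the wrapper Integer; each read in the
              // loop is auto-unboxed (a null element would throw NullPointerException).
              static int sumBoxed(List<Integer> values) {
                  int total = 0;
                  for (int v : values) {
                      total += v;
                  }
                  return total;
              }

              // Primitive arrays avoid boxing, but int[] needs its own overload
              // because it does not fit the generic collection API.
              static int sumPrimitive(int[] values) {
                  int total = 0;
                  for (int v : values) {
                      total += v;
                  }
                  return total;
              }

              // Workaround for the missing `new T[length]`: create the array
              // reflectively from a Class token (reference types only).
              @SuppressWarnings("unchecked")
              static <T> T[] newArray(Class<T> elementType, int length) {
                  return (T[]) Array.newInstance(elementType, length);
              }

              public static void main(String[] args) {
                  System.out.println(sumBoxed(List.of(1, 2, 3)));         // 6
                  System.out.println(sumPrimitive(new int[] {1, 2, 3}));  // 6
                  String[] names = newArray(String.class, 2);
                  System.out.println(names.length);                       // 2
              }
          }
          ```

          So autoboxing hides the conversion at call sites, but the duplicated overloads and the array workaround remain.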
    • spankalee 5 hours ago
      As long as they're already writing new code in Kotlin, translating the existing code makes a ton of sense, if they can do it cost effectively (which it sounds like they did).

      One of the huge problems with a language migration is that you're left with old and new and all the parallel tooling, context switching, and impedance mismatches. There's often a claim that the existing code will be migrated eventually, but IME that doesn't actually happen with large code bases. Then if there's ever a third migration, you're left with three languages in play.

      It's much better to aim for 100% migration, as close to 100% automated as possible. Then when you're done, you're really done. Maintenance, training, and the next migration will be easier.

    • m0zzie 4 hours ago
      The value in the conversion of existing code in this particular case isn't 100% clear to me either, but I think calling Kotlin a weaker choice than Java at this time is naive, particularly when preceding that with "there are a mountain of tools" that you can bolt on to Java to give it features that are built in to Kotlin.

      What makes Kotlin such a strong choice for many orgs today is its batteries-included multiplatform capability. We are able to write data models, validation logic, business logic, etc just once and compile to JVM, WASM, and native targets. For orgs with very large codebases and frontend applications (web + iOS + Android) this is an attractive capability because we can have a single codebase for a ton of core functionality, and have each frontend import a library which is native to its own platform.

      Of course many technologies with this promise have come and gone over the years, but this is the first one with a strong backing that has allowed us to _natively_ interoperate with each target platform.

      I believe these are all driving factors that have been pushing well known companies, that were previously Java shops, toward Kotlin. If you speak to a broad range of people in the industry you'll find many more orgs moving from Java to Kotlin than from Kotlin back to Java. We can simply get more work done with less code and ship to all our frontend platforms, and unless Java can do the same, I don't see the industry moving in that direction.

    • halfmatthalfcat 5 hours ago
      I would argue it was Scala, not Kotlin, that has contributed to the push to make Java “better”.
      • desiderantes 5 hours ago
        Scala and Groovy were big pushes for Java 7/8. Their hypes died down after that release.
        • marwis 4 hours ago
          Where do you see pattern matching in Kotlin?
    • needlesslygrim 2 hours ago
      There are, at least in my opinion, many more reasons to use Kotlin than null-safety and data-classes/records, especially on Android (Jetpack Compose).
    • chasil 5 hours ago
      Is Oracle a factor?

      Does a Kotlin codebase have more safety from a legal perspective?

      • nradov 5 hours ago
        The legal issues around Java between Alphabet and Oracle are settled at this point and no longer a risk for third-party software vendors. But it's pretty clear that Android is moving away from Java, so anyone with a strategic commitment to that platform has to plan around that reality.
        • retrodaredevil 3 hours ago
          It's safe to say that Android is definitely moving away from Java in terms of new language features. I mean, if you want to support older Android versions and use modern Java features or newer parts of standard libraries, you'll usually have to rely on desugaring or making sure you're using classes that are supported in Android.

          IMO, Android is moving away from modern versions of Java. Java and its underlying standard library will always play a big role in Android development.

          The way I see it, Kotlin makes a lot of sense for Android development because Kotlin can introduce new things in its standard library that make older versions of Java look ancient. It's not like you can use new Java features in Android, so using Kotlin makes people not care as much about new features of Java that are only available in modern versions of Java.

      • chii 5 hours ago
        unless you're not using the jvm (which is owned by oracle, despite being opensource), you won't have any difference between kotlin and java from a legal perspective tbh.
        • a57721 4 hours ago
          > unless you're not using the jvm (which is owned by oracle, despite being opensource)

          It's not "the JVM"; JVM is a spec that has many implementations, you are probably referring to Oracle JRE/JDK.

    • nradov 5 hours ago
      Meta operates at such a large scale that the engineering management decision process becomes qualitatively different from smaller organizations. They can justify enormous investments in keeping their code base fresh for small improvements in productivity and quality.
  • hitekker 2 hours ago
    At my last job, the management greenlighted a full rewrite in Kotlin in order to attract/retain developers bored with Java and Python. The actual business project was boring since all the important design work was already finished by the architect. No language rewrite, no interested devs. So management made a quiet trade with ICs where everyone got what they wanted at the cost of future maintenance.

    I learned that social whims (developer fun, preferences, dopamine) are weighted as much as technical rationales (performance, maintenance).

    • azemetre 1 hour ago
      How would you say it panned out? Were you able to hire people easily and did retention go up?
      • hitekker 1 hour ago
        Initially, Kotlin attracted mid-level folks to transfer. I recall a roaming staff engineer who led the project for a bit, and then wisely rotated outside of the org.

        Long-term, the folks who remain are stuck and unhappy. The business got what it needed and moved on, leaving a system half in maintenance-mode, half feature-complete. Any Kotlin-only changes are mostly just bug-fixes or glue work with other teams' SDKs. Any new features carry a lot of risk because the few people motivated to drum up new business (and write up requirements) left or quiet-quit.

        In a weird way, the project naturally selected for people who would get stuck maintaining the artefacts of the project. It's a paradox that I can't fully comprehend.

  • keyle 5 hours ago
    I have a genuine side question... Why does Meta have 10M lines of Java for their Android code base? What's in it?
    • aithrowawaycomm 4 hours ago
      There's a lot of subtlety in what exactly a "line" is, especially for Java and especially for a legacy enterprise codebase: hard to say that Meta's 10M is actually twice as big as someone else's 5M.

      I have no clue what's in their code, but I would expect a lot of "almost redundant" stuff around specific Android versions, manufacturers, etc, which can really pile up.

    • t-writescode 5 hours ago
      Java's how you wrote Android apps before Kotlin came out. I expect they have __all their existing Android code__ in Java. 10MM lines doesn't seem out of line for a very, very established company with 100k developers across several products. It's one of the 3 main platforms that people interact with Facebook on and so they'd want it to be as good and as fast as possible, especially on older phones for the time when Android phones were new.
      • keyle 4 hours ago
        I'm well aware of the history of Android, Kotlin and Java. Still, my question stands...
      • wg0 5 hours ago
        Why wouldn't they use React Native?
        • t-writescode 5 hours ago

            * Facebook Messenger came out on Android in 2011 [0]
            * React Native came out in 2015 and is written in C++, Java, JavaScript, Objective-C, Kotlin [1]
          
          [0] https://en.wikipedia.org/wiki/Messenger_(software)

          [1] https://en.wikipedia.org/wiki/React_Native

          So, even if they did use React Native, they still have 4+ years of code in the original language; and, React Native doesn't stop the use of Java

          • lawgimenez 5 hours ago
            Most importantly, why was Threads not written in React Native?
            • Yiin 4 hours ago
              Because Instagram wasn't, and Threads is pretty much a copy pasted Instagram codebase (for good reasons).
          • rycomb 5 hours ago
            I'm pretty sure wg0 was being facetious.
    • smrtinsert 4 hours ago
      Really the only question I have on this post. So many of their non-AI updates make me roll my eyes.
  • valenterry 5 hours ago
    Crazy project.

    Personally I find that it's an interesting indicator of the capability of the programming languages. Moving from language A to B can be extremely easy if B is as powerful or more powerful in terms of expressiveness than A. It can be an absolute horror if it is less powerful.

    Being not null-safe in fact brings additional expressiveness. I guess most would argue that it's not a good type expressiveness. Nonetheless it is a type of expressiveness that can cause trouble during such a transition.

    In general it feels like Java is making faster progress than Kotlin nowadays. I wonder where Kotlin would be if it weren't for Android. I hope those folks don't have to migrate back before they even finished.

    • 0cf8612b2e1e 5 hours ago
      Without Android, Kotlin would just be Cool Alternative Language #4721. Java has been a safe, non-controversial pick for decades. Who is going to endorse writing their hot new project in it just because some IDE uses it? When Google says they support a technology going forward, that gives the project some cachet it otherwise never would have received.
  • freeqaz 6 hours ago
    What are the benefits of Kotlin over Java? Something I wish they went into!
    • t-writescode 6 hours ago
      I commented here: https://news.ycombinator.com/item?id=42483538

      Mostly, I find it far less verbose with a couple huge convenience features that I'm constantly using and would feel very lost without; and, they all come as part of the language, rather than bytecode hacks / pre-processors, like Lombok.

      • smrtinsert 4 hours ago
        Lombok is absolutely painless.
        • t-writescode 2 hours ago
          I admit I have no direct experience with it. For my perspective on it, I rely on one of the best developers I've ever worked with - especially when it comes to debugging and deep investigation and she *hates* it. If I had to assume (we don't work together anymore), it's because it did some freaky stuff that made debugging the weird bugs really, really hard.
    • TheSociologist 5 hours ago
      Steve Yegge has a fun article on why someone might prefer kotlin over Java: http://steve-yegge.blogspot.com/2017/05/why-kotlin-is-better...

      Do note that Java has quite a few features now that it didn’t at the time of writing.

      • billisonline 5 hours ago
        “More specifically, from JetBrains, the makers of the world-famous IntelliJ IDEA IDE, whose primary claim to fame is its lovely orange, green, purple and black 'Darcula' theme”

        This must be a bad attempt at a joke, right? Darcula is a nice theme (I personally prefer high contrast), but surely IntelliJ’s code inspection and automatic refactoring have always been its claim to fame.

        • TheSociologist 4 hours ago
          Yegge’s writing has a pretty sardonic sense of humor. I like it but I’ve seen it put off some folk.
        • sitkack 4 hours ago
          Themes are the cup holders of the IDE world. You can't sell one without them.
    • cute_boi 6 hours ago
      > maximize our gains in developer productivity and null safety

      Java is too verbose. Kotlin has features like null safety, better functional programming, better coroutines, and advanced sealed classes. Java's virtual threads are still not ready, and development has been very slow. Blame Oracle for being too complacent.

  • spullara 5 hours ago
    This seems like a huge waste of time unless they expect Google to deprecate Java on Android - which isn't impossible.
    • politelemon 5 hours ago
      Some of the newer material libraries are kotlin only. It does seem to be happening.
      • wg0 5 hours ago
        Kotlin can't be called from Java land?
  • nutanc 5 hours ago
    I am surprised they did not use LLMs like Claude or maybe even train their own Llama version to do this. In my experience LLMs have been very reliable in translating code.
    • snovymgodym 1 hour ago
      It seems pretty obvious to me why deterministic code translation is preferable for something like this.
    • lazide 4 hours ago
      There is zero chance they’d end up with a functional code base after attempting to do this.
  • neocron 5 hours ago
    Ah, here we go again with HHVM and Hack ...

    The only reason fb is able to do this, is the billions of $ behind it... For everyone else this is just pure idiocy

    Sure, if you like Kotlin, use it for new software, but rewriting millions of LOC for some marginal gains... that's how businesses fail more often than not.

    • forgot_old_user 5 hours ago
      I think it's great for recruiting. This signals to the world their investment in making devs happier (one of the top two reasons mentioned was "devs were happier with Kotlin").
  • latenightcoding 6 hours ago
    > we decided that the only way to leverage the full value of Kotlin was to go all in on conversion

    Could someone expand on this please.

    • acaloiar 6 hours ago
      In addition to what @phyrex already pointed out, without any Java in the code base, they probably hope to hire from a different cohort of job seekers.

      Younger developers are far less likely to know Java today than Kotlin, since Kotlin has been the lingua franca for Android development for quite some time. Mobile developers skew younger, and younger developers skew cheaper.

      With Java out of the code base they can hire "Kotlin developers" wanting to work in a "modern code base".

      I'm not suggesting there's something malevolent about this, but I'd be surprised if it wasn't a goal.

      • trollbridge 5 hours ago
        I think you're on to something here. When recruiters contact me about Java jobs, I tell them my level of interest in a Java job is about as high as RPG or COBOL, and that I'm wary of spending time on a "legacy" technology. Most of them are fairly understanding of that sentiment, too.

        If I had someone call me about Kotlin, I would assume the people hiring are more interested in the future than being stuck in the past.

    • phyrex 6 hours ago
      From the article:

      > The short answer is that any remaining Java code can be an agent of nullability chaos, especially if it’s not null safe and even more so if it’s central to the dependency graph. (For a more detailed explanation, see the section below on null safety.)

      • trollbridge 5 hours ago
        One of my biggest gripes with an otherwise strictly typed language like Java was deciding to allow nulls. It is particularly annoying since implementing something like NullableTypes would have been quite trivial in Java.
        • t-writescode 5 hours ago
          Would it have been trivial and obvious for Java (and would Java still have been "not scary") back in the 90s when it came out?
          • nradov 5 hours ago
            It wouldn't have been particularly hard from a language, standard library, and virtual machine perspective. It would have made converting legacy C++ programmers harder (scarier). Back then the average developer had a higher tolerance for defects because the consequences seemed less severe. It was common to intentionally use null variables to indicate failures or other special meanings. It seemed like a good idea at the time.
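
            A small sketch of that trade-off in today's Java, assuming a hypothetical module-to-owner lookup (all names are made up for illustration). The first method uses null as the "not found" signal; the second puts the absence in the return type.

            ```java
            import java.util.Map;
            import java.util.Optional;

            class NullSentinelSketch {
                // Hypothetical data, used only for this illustration.
                static final Map<String, String> OWNERS = Map.of("core", "alice");

                // Null doubles as "not found"; nothing forces callers to check.
                static String ownerOrNull(String module) {
                    return OWNERS.get(module);
                }

                // The possible absence is visible in the signature.
                static Optional<String> owner(String module) {
                    return Optional.ofNullable(OWNERS.get(module));
                }

                public static void main(String[] args) {
                    System.out.println(owner("core").map(String::toUpperCase).orElse("unowned")); // ALICE
                    System.out.println(owner("ui").map(String::toUpperCase).orElse("unowned"));   // unowned
                    // ownerOrNull("ui").toUpperCase() would compile and then throw NullPointerException.
                }
            }
            ```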
            • kelnos 5 hours ago
              > It would have made converting legacy C++ programmers harder (scarier).

              And that, right there, is all the reason they needed back then. Sun wanted C++ developers (and C developers, to some extent) to switch to Java.

  • mukunda_johnson 6 hours ago
    Wait a minute, this isn't an AI article...
  • zahlman 6 hours ago
    This article absolutely reeks of ChatGPT to me. For example:

    >With this in mind, we set out to automate the conversion process and minimize interference with our developers’ daily work. The result was a tool we call the Kotlinator that we built around J2K. It’s now comprised of six phases:

    followed by a list of descriptions of the "phases" which only sort of make sense for the name given to them, and are utterly incoherent as actual phases in a process (and grammatically inconsistent). For example, one of the cited "phases" is... "headless J2K". In other words: they have one piece of software that wraps another, and it - gasp - doesn't use the wrapped software's GUI. Aside from being entirely unremarkable, that's neither a phase in a process nor a component of a tool. It's a fact about the component.

    LLMs write like this all the time - and it's clear evidence that they do not, in fact, do anything like reasoning, even if they can sometimes be manipulated into generating a second piece of text that resembles an analysis of the first one. The resulting description is so weird that I question whether the authors actually checked the LLM's output for accuracy.

    Any human writer who gives a damn about good writing and has any skill, would not allow "it" to refer to "the conversion process" two sentences back when "a tool called the Kotlinator" has been introduced in the interim (or, if that were the intended referent, would notice that tools are not "comprised of phases"). Such a writer would also not come up with abominations like "the conversion process is now comprised of six phases" where "we now use a six-phase conversion process" would be much clearer. Certainly, a six-point bullet list produced by a competent writer would label them in a grammatically consistent way (https://en.wikipedia.org/wiki/Parallelism_(grammar)) - not with an abstract noun describing an action, two participles, two concrete (well, as concrete as software ever is) nouns and a command (who, exactly, is being told to "build" the "error-based fixes" - whatever that means - here?).

    I'm starting to feel like Mark Twain.

    ----

    On the other hand, I was cynically expecting some mention of using AI for the actual task, and that doesn't seem to be the case.

    (Also, the "reactive" web design is broken. The page overflows horizontally for some range of window widths, without causing a horizontal scrollbar to be added.)

    • LegionMammal978 5 hours ago
      Calling it a 'phase' doesn't seem that weird to me? For each file, they go through a number of steps to translate it, and one of those steps is to run the J2K tool on it. The next section is just describing how they implemented that step: the J2K tool is normally enmeshed into the rest of the IDE, but they managed to jerry-rig a solution to run the relevant code by itself, without running the rest of the IDE with it every time.
      • zahlman 5 hours ago
        It could possibly make sense if the phases were described as e.g. "performing a deep build, preprocessing, running J2K, postprocessing, linting and applying error-based fixes" (i.e., all participles). But then the descriptions should be about how those phases are implemented (i.e., this is the place to mention the work of setting up a headless version of J2K so that it could be run in an automated pipeline) and how they contribute to getting the desired result. The description "The J2K we know and love, but server-friendly!" is utterly useless here, and also practically the archetype of what ChatGPT writes in "promotional material" mode.
    • Philpax 5 hours ago
      You are tilting at windmills. The article exhibits the kind of English I'd expect from someone sharing a technical development in an acceptably-perfunctory style for a company blog.
    • amelius 6 hours ago
      > cynically expecting some mention of using AI for the actual task

      Well, you could combine AI with correctness preserving transformations, so you get the best of both worlds (i.e., correctness AND a translation that keeps the resulting code close to what a human would write).

    • spankalee 5 hours ago
      What's not a "phase" about running J2K?
      • zahlman 5 hours ago
        The phase is, as you say, (the act of) running J2K - not a property (being headless) of the J2K implementation used.
  • gerdesj 5 hours ago
    "Android development at Meta has been Kotlin-first since 2020, and developers have been saying they prefer Kotlin as a language for even longer."

    Not one link to an opinion piece or two regarding: "kotlin vs java". The nearest thing I found was "What makes Kotlin different".

    This sounds somewhat like a debate about which Germanic language is best. German, Dutch and English are all "Germanic" but which is "best"?

    Obviously: That's the wrong question to ask and so an answer is doomed to failure.

    In the end, does your generated machine code implore the CPU and associated hardware to do what you want it to more efficiently in some way that is not an abstraction?

    Pissing contests rarely excite me. Why did you do this?

    • kelnos 5 hours ago
      > Why did you do this?

      You literally quoted the "why": their developers prefer writing Kotlin over Java for Android development. That's it. They don't need further justification. They didn't need a "Kotlin vs. Java" comparison, and they're not really evangelizing Kotlin all that much. They're simply stating a fact for their organization: Kotlin is a better fit for their developers than Java is.

      > In the end, does your generated machine code implore the CPU and associated hardware to do what you want it to more efficiently in some way that is not an abstraction?

      Most shops don't care about this too much. The most important thing is developer productivity.

      (And yes, this is why we have bloated garbage like Electron these days; sometimes some people value developer productivity to unhealthy extremes.)

      • gerdesj 5 hours ago
        So why bother posting on HN if it was inevitable?
        • erik_seaberg 4 hours ago
          Seems to me the article focused on how they did this big job, rather than why.
    • troad 5 hours ago
      I think a pissing contest is precisely what they were hoping to avoid by not linking to some "Kotlin v Java" blog post. That their developers prefer writing in Kotlin is a basic premise for the rest of the article, not its thesis.
    • occz 3 hours ago
      The target audience for this kind of article - Android engineers - do not need any more convincing on why Kotlin is superior to Java. This is a debate that has been settled in the Android community for a long time. I'd be willing to bet money on there only being a rounding error of Android engineers advocating the use of Java over Kotlin these days.
    • gerdesj 5 hours ago
      Blimey, DV'd within seconds!