Programming Language Network

Introduction

Programming languages are tools that developers use to communicate instructions to computers. They encompass a set of rules and symbols that define the syntax and semantics for writing code.

As developers gain experience and work on different projects, they often specialize in a set of programming languages. This specialization can be influenced by various factors such as personal interest, job requirements, industry trends, and project needs.

The data used for this visualization comes from the 2023 Stack Overflow Developer Survey.

Methods

The data consists of the programming languages that each survey respondent reported having worked with in the past year. From these responses, a co-occurrence matrix was generated to drive the visualization; the matrix is available in the public/data directory of the source code.
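
As a rough illustration of how such a co-occurrence matrix could be built, here is a minimal Python sketch. It assumes each response is a semicolon-separated string of language names; the function name cooccurrence and the toy responses are hypothetical and are not taken from the project's source code.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(responses):
    """Count respondents per language and per unordered language pair.

    Each response is assumed to be a semicolon-separated string of
    language names (an assumption about the input format).
    """
    counts = Counter()  # language -> number of respondents
    pairs = Counter()   # (lang_a, lang_b), sorted alphabetically -> respondents using both
    for answer in responses:
        # De-duplicate and sort so every pair has one canonical ordering.
        langs = sorted({name.strip() for name in answer.split(";") if name.strip()})
        counts.update(langs)
        pairs.update(combinations(langs, 2))
    return counts, pairs


# Toy data for illustration only, not real survey responses.
responses = ["C;C++;Assembly", "C;Python", "Python;JavaScript", "C;Assembly"]
counts, pairs = cooccurrence(responses)
```

Storing each pair once, in sorted order, matches the undirected nature of the network: the count for (Assembly, C) is the same as the count for (C, Assembly).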

Each node in the visualization represents a programming language, with the radius of the node being proportional to the cube root of the number of respondents who have worked with that language. The simulation employs three distinct forces on each node: attraction and repulsion forces between nodes, and a radial acceleration to maintain stability.
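
The cube-root scaling described above can be sketched as follows. The function name node_radius and the scale parameter are hypothetical; the actual proportionality constant used in the visualization is not stated here.

```python
def node_radius(respondents, scale=1.0):
    """Radius proportional to the cube root of the respondent count.

    `scale` is a hypothetical proportionality constant.
    """
    return scale * respondents ** (1.0 / 3.0)


# A language with 8 times as many respondents gets a node only twice as wide.
small = node_radius(1000)
large = node_radius(8000)
```

One effect of this choice is that very popular languages do not visually overwhelm less popular ones, since the radius grows much more slowly than the respondent count.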

The attraction forces act along the edges of the network, which are undirected. The strength of each attraction is determined by the square of the larger of the two overlap fractions between the languages. For instance, if 5% of C programmers have worked with Assembly, but 90% of Assembly programmers have worked with C, the larger overlap is 0.9 and the edge weight is a coefficient multiplied by 0.9 squared, or 0.81. The repulsion force, by contrast, is inversely proportional to the squared distance between two nodes.
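
The edge weight and repulsion rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the project's implementation: the function names, the coefficient and strength defaults, and the respondent counts in the example are all hypothetical.

```python
def edge_weight(both, count_a, count_b, coefficient=1.0):
    """Attraction weight for an undirected edge between languages A and B.

    both    -- respondents who worked with both languages
    count_a -- respondents who worked with A
    count_b -- respondents who worked with B
    `coefficient` is a hypothetical tuning constant.
    """
    overlap_a = both / count_a  # share of A users who also used B
    overlap_b = both / count_b  # share of B users who also used A
    return coefficient * max(overlap_a, overlap_b) ** 2


def repulsion(distance, strength=1.0):
    """Repulsion magnitude, inversely proportional to squared distance."""
    return strength / distance ** 2


# Made-up counts reproducing the example: 90 of 1800 C users (5%) used
# Assembly, and those same 90 are 90% of the 100 Assembly users.
weight = edge_weight(both=90, count_a=1800, count_b=100)
```

Taking the larger overlap means a niche language is pulled strongly towards a popular language that most of its users also know, even when the reverse overlap is small.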

Results

One notable observation from the simulation is that certain groups of languages gravitate towards each other regardless of how the nodes are shuffled. For example, C, C++, and Assembly tend to cluster together, as do languages from the Microsoft ecosystem such as Visual Basic .NET, C#, VBA, PowerShell, and F#. Web-related languages also tend to cluster together.

By graphing the interconnectedness of programming languages, this visualization can offer insights into the factors that shape developers' language choices and specialization paths. It could also be used to confirm or challenge biases and stereotypes about developers of certain languages, since it shows how languages cluster based on their real-world usage.