Ariadne


Ariadne is a tool I made to help me discover and organize papers, designed around the interaction of citation chasing. It leverages GPU compute to scale interactively to graphs with hundreds of thousands of nodes, letting users iteratively populate their graph with papers of interest and expand on connections discovered while exploring it.

Motivation

How does a user know what questions they want to ask when learning about a new research topic, or when exploring a new space through data? If we can represent the underlying data as a graph and make that structure explicit to the user, it opens up opportunities for a variety of interactions.

To learn about a topic, you might engage in citation chasing. A typical approach would be to skim the bibliography of a paper, then manually look up the papers that piqued your interest. This is both tedious and time-consuming. A slightly faster way of citation chasing is to look up your starting papers in a research tool like Semantic Scholar or Google Scholar, then leverage their citation networks to quickly reach the papers that pique your interest instead of having to find them manually.

However, this still loses a useful piece of information – the connections between papers. At first, it might be possible to keep track of how a few papers are related, but as the search progresses, the collection quickly grows unmanageable and those connections have to be rediscovered¹.

By visualizing the graph structure during the chase, it’s possible to see how all the papers are connected, highlighting papers with many links to others in your chase (likely fundamental to your search), or connections between papers you might have missed that warrant a second look.

You could also imagine that this would extend to finding connections between multiple research topics, or making sense of a field of research as a whole.

Approach

Opportunities for novel interactions

Many user interactions have traditionally limited the amount of data presented – either intentionally, to highlight what’s deemed most relevant, or due to past computational constraints. While these approaches are effective, I’m especially interested in what becomes possible when those limitations are removed. Interactions that work well at small scales don’t always extend to larger datasets, but by building tools that support both, we can combine their strengths, explore hybrid solutions, and discover new interactions adaptable across all scales.

Performance

Ariadne has been built with performance as a primary concern, offloading computation and rendering to the GPU whenever it makes sense to do so. At the time of writing, Ariadne performs graph layout using a GPU-accelerated Barnes-Hut algorithm over a BVH built entirely on the GPU. Label placement and text culling are done with compute kernels, and all rendering is done with a custom but fairly standard GPU graphics pipeline. The CPU mainly handles user input, fetching data over the network, and marshalling data to and from GPU buffers.
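For intuition, here’s what the core of Barnes-Hut looks like as a minimal CPU sketch in Python. This is not Ariadne’s implementation – the app traverses a BVH on the GPU – and the quadtree cell, force model, and θ value below are illustrative assumptions:

```python
# A minimal CPU reference of the Barnes-Hut far-field test, for intuition
# only. Ariadne traverses a BVH on the GPU; the recursive quadtree cell,
# force model, and theta value here are illustrative assumptions.
from dataclasses import dataclass, field

THETA = 0.8  # opening criterion: larger is faster but less accurate

@dataclass
class Cell:
    cx: float      # center of mass, x
    cy: float      # center of mass, y
    mass: float    # total mass of the bodies in this cell
    size: float    # side length of the cell
    children: list["Cell"] = field(default_factory=list)  # empty => leaf

def repulsion(px: float, py: float, cell: Cell) -> tuple[float, float]:
    """Approximate the repulsive force on the point (px, py)."""
    dx, dy = px - cell.cx, py - cell.cy
    dist2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
    # Far-field test: if the cell is small relative to its distance
    # (size / dist < THETA), treat all its bodies as one point mass.
    if not cell.children or cell.size * cell.size < THETA * THETA * dist2:
        f = cell.mass / dist2
        return f * dx, f * dy
    fx = fy = 0.0
    for child in cell.children:
        cfx, cfy = repulsion(px, py, child)
        fx, fy = fx + cfx, fy + cfy
    return fx, fy
```

The payoff of the far-field test is that most of the graph is summarized by a handful of aggregate cells per node, which is what makes the layout tractable at hundreds of thousands of nodes.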

Local-first

Aside from the sources providing information about the papers and how they are connected, all processing happens on-device. Once fetched, the data is stored locally for all app interactions. Our personal devices are hugely powerful, and doing things locally results in a more responsive experience under a wider variety of conditions, albeit at some cost to battery life. I aim to avoid unnecessary and wasteful computation, but for now I am prioritizing responsiveness. This might not be as exciting to others, but it’s important to me.

Examples

Finding connections between multiple topics

Suppose that I know a bit about GPU Programming and a bit about Bounding Volume Hierarchies (BVHs), but I’m not sure about work related to building a BVH on the GPU. We’ll start by building two initial graphs, each seeded from the first page of results (10 papers) on Google Scholar for one of the two search terms. Ariadne will then also pull in those papers’ references and citations to populate the graph.
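As a rough sketch of this seed-and-expand step, here’s what the fetching could look like against Semantic Scholar’s public Graph API. The post doesn’t specify which sources Ariadne actually queries, so the endpoints, fields, and function names here are illustrative assumptions, not Ariadne’s implementation:

```python
# A hypothetical seed-and-expand sketch using Semantic Scholar's public
# Graph API. The endpoints are real, but Ariadne's actual sources and
# internals are not specified in this post.
import requests

API = "https://api.semanticscholar.org/graph/v1"

def seed_papers(query: str, limit: int = 10) -> list[dict]:
    """Fetch the first page of search results to use as seed nodes."""
    r = requests.get(f"{API}/paper/search",
                     params={"query": query, "limit": limit,
                             "fields": "title"})
    r.raise_for_status()
    return r.json()["data"]

def expand(paper_id: str) -> list[tuple[str, str]]:
    """Pull one paper's references and citations as new graph edges."""
    edges = []
    for rel, key in (("references", "citedPaper"),
                     ("citations", "citingPaper")):
        r = requests.get(f"{API}/paper/{paper_id}/{rel}",
                         params={"fields": "title"})
        r.raise_for_status()
        edges += [(paper_id, item[key]["paperId"])
                  for item in r.json()["data"]]
    return edges

# Seed two graphs, one per search term, then expand each seed.
seeds = (seed_papers("GPU Programming")
         + seed_papers("Bounding Volume Hierarchy"))
edges = [e for p in seeds for e in expand(p["paperId"])]
```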

Our graph for GPU Programming looks like this: A graph of GPU Programming papers.

And our graph for Bounding Volume Hierarchies looks like this: A graph of Bounding Volume Hierarchy papers.

If we combine both graphs, we can see that there are a few papers that link the two graphs we started with (coloring GPU Programming in yellow, and BVH in blue): A graph combining the GPU Programming and BVH graphs.

But upon examining which papers in the BVH graph they are connected to, we find two survey papers and a third that is more generally about collision detection. Useful, but not quite what we’re looking for, since I want to find the implementations more directly.

We can get better results by expanding the areas of the graph we’re interested in, so let’s try expanding the BVH graph a bit. The graph contains a lot of papers that might not all be useful for what we want, so before expanding, we’ll narrow down to the more promising ones by restricting the papers we expand to just those with at least two connections in our graph: A zoomed in view of the BVH graph with some papers filtered out.
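That degree filter is simple to express. A minimal sketch, assuming the graph is held as an adjacency mapping from paper id to neighbor ids (the structure and names are hypothetical, not Ariadne’s internals):

```python
# Minimal sketch of the "at least two connections" filter. The adjacency
# structure and names are assumptions for illustration.
def papers_to_expand(adjacency: dict[str, set[str]],
                     min_degree: int = 2) -> set[str]:
    """Keep only papers with at least `min_degree` edges in the graph."""
    return {pid for pid, neighbors in adjacency.items()
            if len(neighbors) >= min_degree}
```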

And now that we’ve filtered down to a smaller set, we can expand the graph by fetching all those papers’ references, which in turn should hopefully add more connections to the GPU Programming graph: A view of the BVH graph with additional papers added.
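Tying the two sketches above together, the expansion step could be expressed as (reusing the hypothetical names introduced earlier):

```python
# Expand only the filtered papers, collecting the new citation edges.
new_edges = [e for pid in papers_to_expand(adjacency)
             for e in expand(pid)]
```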

The expanded BVH graph is now a massive hairball that’s difficult to make sense of, but there are still a few things we can do. By selecting all the papers that are bridging the two graphs, we can highlight which papers in the BVH graph they are connected to: A view showing the selected papers bridging the two graphs.
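Selecting the bridging papers comes down to a membership test over each node’s neighborhood. A minimal sketch under the same assumed adjacency structure, with each paper tagged by the search that seeded it (the tagging scheme is hypothetical):

```python
# Map each bridging paper to its neighbors in the opposite seed group.
# `group` tags each paper with the search that introduced it; both the
# tagging scheme and the names are illustrative assumptions.
def bridges(adjacency: dict[str, set[str]],
            group: dict[str, str]) -> dict[str, set[str]]:
    out = {}
    for pid, neighbors in adjacency.items():
        other = {n for n in neighbors if group.get(n) != group.get(pid)}
        if other:
            out[pid] = other
    return out
```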

And zooming in to these papers a bit further, I can see some promising results – Fast BVH Construction on GPUs, Parallel Locally Ordered Clustering for Bounding Volume Hierarchy Construction, HLBVH: hierarchical LBVH construction for real-time ray tracing of dynamic geometry, to name a few.

NOTE

There’s one paper here that Google Scholar would never have found: Real-time KD-tree construction on graphics hardware. This paper doesn’t have anything to do with a BVH, at least by its keywords. But if you know a little about building a BVH, you might know that techniques for building k-d trees can often be adapted to building a BVH. This paper and Fast BVH Construction on GPUs are two of the earliest influential papers in this area of research.

A view of the papers in the BVH graph connected to the selected papers.

If we zoom out a bit, we might notice that four of the (orange) papers in the GPU Programming graph seem to be closer to the BVH graph than the rest of the bridging papers: A zoomed out view of the BVH graph, highlighting some papers that are closer to the BVH graph.

If we look at those four papers in more detail, we can see what they are.

NOTE

In particular, I want to point out Scan primitives for GPU computing and Designing efficient sorting algorithms for manycore GPUs. The proximity of these two papers to the BVH graph is not a coincidence. Many GPU BVH building algorithms rely heavily on performing scans and sorts during various steps of the construction process.

A zoomed in view of those four papers.
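As a toy illustration of the scans-and-sorts dependence mentioned in the note above (heavily simplified, and not Ariadne’s or those papers’ code): LBVH-style builders such as HLBVH sort primitives along a Morton-order space-filling curve and use prefix scans for compaction and node allocation:

```python
# Toy versions of the two primitives. Real GPU builds run data-parallel
# radix sorts and scans; these loops only show what is being computed.
def morton2d(x: int, y: int) -> int:
    """Interleave the bits of two 16-bit coordinates into a Morton code."""
    code = 0
    for i in range(16):
        code |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
    return code

def exclusive_scan(xs: list[int]) -> list[int]:
    """Exclusive prefix sum, the workhorse of GPU stream compaction."""
    out, total = [], 0
    for v in xs:
        out.append(total)
        total += v
    return out

# Sorting by Morton code groups spatially nearby primitives together,
# which is what lets a BVH be built level by level in parallel.
points = [(3, 7), (12, 1), (4, 6)]  # quantized 2D positions
order = sorted(range(len(points)), key=lambda i: morton2d(*points[i]))
```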

Footnotes

  1. ConnectedPapers showcases some of this potential, but limits the size of the graph to only around 40 papers.