Last updated: 3/24/2022
Lucata Pathfinder Getting Started¶
The Rogues Gallery hosts two systems from Lucata (formerly known as Emu Technology): the Gen1 Emu Chick, an 8-node desktop-style system, and the Lucata Pathfinder, a two-chassis system with 16 nodes and 24 cores per node, for a total of 384 cores. We also currently have two additional Pathfinder chassis on loan from Lucata, denoted PF<2-3>.
- Slurm will be deployed in Summer 2022.
Using the EMU simulation and compiler tools¶
The current toolset, documentation, and examples are available on the rg-emu-dev VM and other nodes as a module. Note that the Pathfinder currently requires the use of the latest 22.02 tools.
- rg-login.crnch.gatech.edu: primary login VM for the Rogues Gallery. From off campus, log in to this VM first, then connect to another node for testing and simulation.
- rg-emu-dev.crnch.gatech.edu: VM for Lucata compilation and simulation
- pf<0-3>.crnch.gatech.edu: Lucata Pathfinder chassis for HW execution
- karrawingi-login.crnch.gatech.edu: The main EMU Chick node, used for login and transferring files to a specific node/set of nodes. NOTE: You cannot run any code on this node and will need to copy your code to n0-n7 on the Emu Chick machine.
When getting started, we highly recommend checking out the Lucata Pathfinder Programming Manual (requires GT Github login) and reading through Chapters 1, 2, 3, 5.1, 6, and 7. This will give you a basic understanding of the Cilk-based workflow and the Lucata-specific APIs and tools.
As shown in the figure above, the suggested Lucata workflow combines 1) x86 functionality testing, 2) simulation of code on a VM, 3) execution on a single node of the Pathfinder system, and 4) execution on multiple nodes and chassis.
- Compile your code on rg-emu-dev using <memoryweb.h> and emu-cc.sh to target x86 execution. This will run the Cilk code and emulate any data allocations specified by the Lucata APIs.
- Simulate code on rg-emu-dev. Do debugging and initial verification here, but note that simulation is slow! If you need a machine with more memory, you can use hawksbill.crnch.gatech.edu.
- Profile your code with the simulator for small input sets.
- Make a reservation on the Google Calendar for the Pathfinder to run jobs. We also use our Slack channel to reserve time on the Pathfinder.
- Run your job on a single Pathfinder node (SN<0-7>). Verify its correctness.
- Run your job on a single Pathfinder chassis (8 nodes, PF<0-1>).
- Run your job on multiple Pathfinder chassis (2 chassis).
Tutorials and Training¶
Please check out the recent PEARC21 tutorial for official training material for the Pathfinder systems. There are also some examples and related tools shared in a Github repo at https://github.gatech.edu/crnch-rg/emu-common. Feel free to branch or fork it as suits your research.
Eric Hein has also contributed a nice micro-benchmark that uses serial and recursive spawn: Micro benchmark
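For readers new to Cilk, the two spawn patterns mentioned above can be sketched roughly as follows. This is a minimal illustration and not the linked micro-benchmark: the function names, the `USE_CILK` switch, and the grain-size parameter are all assumptions made for this example. Without `USE_CILK`, the Cilk keywords are defined away (the serial elision), so the file builds with any C compiler for a quick functional check. The `<cilk/cilk.h>` header is the one used by Cilk Plus and OpenCilk; the Lucata toolchain may expect a different include, so check the Programming Manual.

```c
#include <assert.h>

/* Assumption: USE_CILK is a hypothetical switch for this sketch. When it is
   not defined, the Cilk keywords expand to nothing and the code runs serially. */
#ifdef USE_CILK
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

/* Leaf task: increment every element in [begin, end). */
static void worker(long *a, long begin, long end) {
    assert(begin <= end);
    for (long i = begin; i < end; ++i) {
        a[i] += 1;
    }
}

/* Serial spawn: one parent loop spawns a task per chunk of `grain` elements. */
static void serial_spawn(long *a, long n, long grain) {
    if (grain < 1) grain = 1;
    for (long i = 0; i < n; i += grain) {
        long end = (i + grain < n) ? i + grain : n;
        cilk_spawn worker(a, i, end);
    }
    cilk_sync;
}

/* Recursive spawn: split the range in half until it fits in one chunk,
   so the spawning work itself is spread over a tree of tasks. */
static void recursive_spawn(long *a, long begin, long end, long grain) {
    if (grain < 1) grain = 1;
    if (end - begin <= grain) {
        worker(a, begin, end);
        return;
    }
    long mid = begin + (end - begin) / 2;
    cilk_spawn recursive_spawn(a, begin, mid, grain);
    recursive_spawn(a, mid, end, grain);
    cilk_sync;
}

/* Helper for checking results: returns 1 if all n elements equal v. */
static int all_equal(const long *a, long n, long v) {
    for (long i = 0; i < n; ++i) {
        if (a[i] != v) return 0;
    }
    return 1;
}
```

Calling `serial_spawn(a, n, 16)` followed by `recursive_spawn(a, 0, n, 16)` on a zeroed array should leave every element equal to 2, which `all_equal(a, n, 2)` can confirm. Both patterns do the same work; they differ only in how the tasks are created, which is what a spawn micro-benchmark typically measures.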
- The GraphBLAS branch can be found here
- CilkPlus can also be run on CPU-based clusters. For more information on general CilkPlus check out the official website and other Cilk tutorials.
- See our Kokkos branch focused on CilkPlus and eventually on an Emu backend. For more information on Kokkos, check out their website, tutorials, and other documentation.