### Please give me feedback -
---

# Working on Somebody Else's Computer

### Will Styler - CSS Bootcamp

---

### Today's Plan

- Working on somebody else's computer
- Alternatives to bare metal running

---

# You don't always want to use your own machine

---

### Why use your own machine?

- You trust it (more)
- You own it
- It's cheaper, generally
- You can use applications more easily

---

### Why use somebody else's machine?

- It's cheaper
- It has software you don't (or can't) have
- It's more powerful
- You're being forced to
    - Data confidentiality
    - Proprietary software

---

### How do you do that?

- Physical use
    - Hook up a mouse and keyboard and log in to the machine
- ssh
    - `ssh wstyler@ssrde.ucsd.edu`
    - Opens a secure shell to the other computer
    - You get to use whatever shell you'd like
- Remotely hosted software (like Datahub)
    - You send commands, it sends output
    - Graphics on your screen don't get sent, they get generated
- VNC
    - Sends the 'screen image' from their computer to yours, and lets you move the mouse
    - Generally a bad plan unless you need to run a GUI app
- Running Scheduled Jobs

---

## Running Software

---

### Software can be run in many ways

- Interactive Running
- Scripted Running
- Scheduled Running

---

### Interactive Running

- The human is guiding the process
- Often a window is open, the app is 'running' and waiting for you to do things
- Jupyter, ArcGIS, and Word are good examples of this
- "Driving the software"

---

### Scripted Running

- Some applications can be run as a 'batch command'
- Input is provided ahead of time, in the form of a command or script
- The user's only interaction is directing the script to run
- "Set it up then let it run"
- Output is directed to a file, or continues in the background

---

### Interactive Running Pros and Cons

- Very easy to debug and use in real time
- Variables are set and you can 'play around'
- Easier for something you're building
- Some software doesn't support batch running or scripting
- Doing the thing takes the same amount of time
each time you do it for new data
- You become faster repeating tasks, but **you are the bottleneck**

---

### Scripted Running Pros and Cons

- Setting up an automated script can take as long as doing the work manually
- Can be run over and over with new data
- You set it up, then go get lunch
- Scripts are often portable, and run easily on any machine with the software installed
- Scripts run with less 'overhead' (e.g. windows, graphics, etc.)
- Debugging is much more painful and needs other approaches
- Nobody is helping you, you need to figure it out

---

### Scripted running works on remote machines

- ssh into another machine, run your script, leave
- Sometimes, it's required!

---

### Scheduled Running

- Prepare a scripted code chunk ahead of time
- Then let another program run the script at the designated moments
    - This can be done locally (e.g. with `cron` or `systemd`)
    - This can be done remotely, with a scheduling program

---

### Slurm and Compute Scheduling

- "Hey server, here's a job, run it when you can"
- You don't control when it starts and stops
- You can request the resources you need
- This is how many compute clusters run

---

```
#!/bin/bash
#SBATCH --job-name=slurm-test      # create a short name for your job
#SBATCH --output=slurm-%A.%a.out   # stdout file
#SBATCH --error=slurm-%A.%a.err    # stderr file
#SBATCH --nodes=1                  # node count
#SBATCH --ntasks=1                 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1          # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=16G          # memory per cpu-core (4G is default)
#SBATCH --gres=gpu:1               # number of gpus per node
#SBATCH --time=20:00:00            # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin          # send email when job begins
#SBATCH --mail-type=end            # send email when job ends
#SBATCH --mail-user=you@ucsd.edu

conda activate tf2-gpu
python3 mython.py
```

---

## What problems come about when working remotely?
---

### Remote Computer Problem 1: Security

- Don't save anything sensitive 'to the cloud'
    - Make sure you have strong guarantees about your data's integrity before doing anything 'cloud'
    - If you must transfer files online, only transfer encrypted files
- Ask yourself if you trust the administrators on the machine
- Are you connecting securely to the machine?
    - SSH is secure, but you shouldn't do the connection from public wifi

---

### Remote Computer Problem 2: Data Storage and Backup

- Data will need to be sent to, and retrieved from the remote server
    - Unless it 'lives' there
    - How will you do this securely?
- How can you ensure that the data there is backed up?
    - What guarantees does the server provider offer?
    - Do you believe them?

---

### Remote Computer Problem 3: Storage Costs

- You pay for compute, and pay separately for storage
- You may pay for the ingress/egress (that is, sending data to and from)
- Just keeping a large dataset online can be pricy
- Servers you 'own' may have more limited storage space for you to use

---

### Remote Computer Problem 4: Connectivity

- You can't count on your internet connection staying up
    - Wifi disconnections, VPN logouts, Internet Outages, etc.
    - Some shells are designed for this (e.g. [mosh](https://mosh.org/))
- You might not want to keep your computer connected for the duration of a non-scheduler-based 20 hour compute
    - This is where tools like [tmux](https://brainhack-princeton.github.io/handbook/content_pages/hack_pages/tmux.html) can be used to log in, start a session, disconnect, and reconnect to it later
- Slow connections mean slow connections
    - Sending text back and forth over SSH is very fast, but latency is rough

---

### Remote Computer Problem 5: Upload and Download Integrity

- How do you know that the file you uploaded is the same?
    - Hashing
- How do you get a large number of files copied over in one place?
    - Compression!
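---

### Hashing and Compression: A Sketch

A minimal sketch of the two ideas above, assuming Python 3 and only its standard library. The function names (`sha256_of_file`, `bundle_directory`) and the `recordings` folder are made up for illustration; they aren't part of any particular tool.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bundle_directory(source_dir, archive_path):
    """Pack a whole directory into one gzip-compressed tar archive."""
    source_dir = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps paths inside the archive relative to the folder name.
        archive.add(source_dir, arcname=source_dir.name)
    return Path(archive_path)


if __name__ == "__main__":
    # Hypothetical demo data, created in a throwaway directory.
    with tempfile.TemporaryDirectory() as workspace:
        data_dir = Path(workspace) / "recordings"
        data_dir.mkdir()
        for index in range(3):
            (data_dir / f"file_{index}.txt").write_text(f"sample {index}\n")

        archive = bundle_directory(data_dir, Path(workspace) / "recordings.tar.gz")
        # Record this digest before uploading, then compare after the transfer.
        print(f"{sha256_of_file(archive)}  {archive.name}")
```

- Compute the digest again on the copy that lands on the server (for example with `sha256sum`, where it's installed); if the two digests match, the upload arrived intact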
---

### Remote Computer Problem 6: Neighbors

- You may be the only person logged in, or you may be one of 20 people hammering a 24 core machine
- Your scheduled job may take 10 minutes to start, or 10 hours, or 10 days
- Not all systems will enforce user storage limits
    - So, your neighbor might write 1.2 TB of "placeholder data FIXME" to a file before your job runs, blocking you from writing out
- 'Cloud' providers will sell you dedicated cores and RAM and storage that is guaranteed 'yours'
    - "VPS" or 'Virtual Private Server'
    - This is 'your own machine', with nobody else on it
    - You can choose the number of cores, amount of RAM, disk space, and adjust them in real time

---

### ... but wait...

- How can they sell you a machine with an arbitrary number of cores, amount of RAM, disk space, and let you change that at any given moment?

---

## What is a 'machine', anyways?

---

### A Computer is, practically, five things:

- Hardware
- An operating system
- The software environment
- A file system
- A user interface

---

### Your computer has one of each of these things

- It has *only one* of each of hardware, OS, environment, files, and a user interface
    - Maybe multiple file systems

---

### There can be more or less than one of each

- Virtualization
    - "Use one piece of hardware to spawn many OSes with many environments, filesystems, and interfaces"
- Containerization
    - "Use one set of hardware and one OS to run many environments"
    - Sometimes multiple file systems, sometimes a unified system
    - Docker, Podman, Kubernetes, Distrobox...
- Headless running
    - "We don't need a user interface at all on this machine, it can just respond to application requests"
- Storage Area Networks (SANs)
    - "This computer doesn't need storage, it can just use the storage from another computer"

---

### Why use virtualization?

- "Use one piece of hardware to spawn many OSes with many environments, filesystems, and interfaces"

---

### Why use containerization?
- "Use one set of hardware and OS to run many environments"

---

### Why run a computer headless?

- "We don't need a user interface at all on this machine, it can just respond to application requests"

---

### Why use a Storage area network?

- "This computer doesn't need storage, it can just use the storage from another computer"

---

### Why use a boring, bare-metal install on a computer?

---

### Wrapping Up

- There are many ways to work on somebody else's computer
    - Interactive running with SSH
    - Scripts
    - Scheduled Jobs
- There are problems associated with remote work
- Not all "computers" are physical machines, and there are many ways to run code