Distributed Workflows (dwf) 🔼
Description 🔼
Short: A web of processes executing on different CPUs (different computing devices).
In a network of computer nodes, a user often has access to computational resources and data that reside on various nodes. A user usually wants to implement his/her workflows, and a workflow may need access to several of those network resources. In that context I speak of the remote workflow integration problem.
So a distributed workflow can be abstracted as a graph of processes, data objects and their interconnections. Some processes have I/O to users and probably to sensors (of course process-to-process links also entail I/O). Some processes also have I/O to entities other than the involved processes.
Depicting a distributed workflow 🔼
An example graph of a distributed workflow :
r---p ---u
/ /
d==p------p___e
\ /
\ p___e
\ /
(a--s)__ p---------p===d
We assume that each p is on a different host. A user can interact only with certain processes on certain hosts. Some processes have links to various peripherals and to data storage devices. Finally, for the sake of generality, some processes could link with IoT or edge devices (usually equipped with sensors and actuators).
Relocating a distributed workflow. 🔼
That means transferring some parts of a user's workflow, or a whole workflow, from one compnode to another compnode. That could be more difficult than it sounds. All of the user's workflows should function after the relocation as they did before.
Distributed workflows (dwf) examples 🔼
The base case (non-distributed workflow) 🔼
A user sits at a host (with a display) using a shell. Let's assume that there are processes that have a human shell integrated (sh-p). So a user interacts with such a process (hp):
____host A______
u----con-|----- sh-p |
|____________CPU|
Basic minimal dwf (transfer data) 🔼
Description 🔼
____host A_____ l _____ host B______
u<--con-|--shp -----lp-|---------|-lp--p----d |
|__________CPU_| |____________CPU__|
User (u) accesses data (d) that reside on remote host B. A program must run on host B that implements the communication layers necessary to transfer data over a link between hosts.
A shp is a shell process: a process that takes input from the user and starts other processes.
So an inter-host link and the network-protocol processes that run at the edges of the link are the minimum prerequisite to establish a minimal dwf.
The transferred data may or may not be processed further on host A. But since the user wanted that data, she will use it in one of her workflows. We are more interested, though, in workflows that run inside the hosts; a user could, after all, just read a remote file and learn something.
Having transferred the data, we could then save it. But that happens locally. Let's focus on the distributed part.
practice it. 🔼
Connect to a remote host and read a file that resides there. Congratulations. You practiced your first distributed workflow.
Use rlogin, telnet or ssh.
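The exercise above can be sketched as a small shell function. This is only a sketch: user@hostB and the file path are placeholders, and the REMOTE and SSH variables are names introduced here for illustration, so that the account, host and client can be swapped.

```shell
# Placeholders (assumptions, not from the notes above): REMOTE names the
# remote account and host; SSH is the client used to reach it.
REMOTE="${REMOTE:-user@hostB}"
SSH="${SSH:-ssh}"

# Print a file that lives on the remote host.
# cat runs on host B; its output travels back over the link to host A.
read_remote_file() {
    "$SSH" "$REMOTE" cat "$1"
}

# Usage (needs a reachable host and a working login):
#   read_remote_file /etc/hostname
```

Only the process that reads the file runs on host B; the shell that started it and the terminal that shows the result stay on host A.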
remote shell 🔼
Now what happens with an X app (xp) and X11 forwarding? An X client is a process paired with a graphical shell (xp) that can 'talk' the X protocol to a display server. Xcon is an X console.
_____Xhost A____
A. u - Xcon-|--X-------xp |
|____________ CPU|
____host A____ _____ host B______
B. u<-Xcon-|--X ----------|---------|-------xp |
|__________CPU_| |____________CPU__|
____host A____ _____ host B______
C. u<-Xcon-|--X ----------|---------|-------xp--X |--Xcon
|__________CPU_| |____________CPU__|
The difference between the first and the second case is that xp and X are on different computer nodes and host B is not connected to an X console. One question could be: why don't we install B's xp on host A? One reason could be that A/CPU/uarch/ISA != B/CPU/uarch/ISA. So we need B/xp, and it has to run on host B. And on host B there is no X, no Xcon, or neither. So we must accept these as facts: there is X; there is a device, the X console; and there are computer nodes that lack either or both. Why? Maybe they are mainframes with shared resources.
But what about case C? A user could sit at host B, so why go to host A? A user can be near only one host. And if no resource of our workflow is on host B, there is no distributed workflow.
1. A user has a distributed workflow (distr-wf) when some of the computational resources she needs are on a different host (compnode) than the one she is sitting at. But what constitutes a comp-resource is not trivial to state. Let's say the user accesses host A and edits a file on host B. Isn't that such a simple dist-wf that it could be transformed into a local-wf by simply transferring the file? That depends on various things: what the file contains, whether other files on host B refer to that file, and whether processes on B use it. In short, it depends on the whole set of workflows the user has on host B.
Editing a file that resides on another host:
u -- hp _______ p -- d
Of course we could transfer the file. But that is also a distributed workflow: transferring data over a link because you need it. The most basic distributed workflow is then transferring a file. Another is editing a file that is remote.
conflicts in one user's workflow 🔼
While unfolding our workflow we may access a data file through different processes executing on different hosts. Using different processes (instances of the same program) to access and change one data object seems counterintuitive. Imagine hitting a nail with two hammers that are functionally the same. Why would we want to do that? Perhaps to drive the nail faster? But the analogy doesn't clearly show the problem that is created. Programs like emacs work on buffers, so we don't immediately see the changes that other processes make to the data.
In X11 forwarding we tend to use our local host as a terminal to a remote host. That relation assumes a local host where no useful resources reside (in relation to our workflows). But since the PC era (1980s) and the proliferation of networking, it is common to have resources on both our local and remote hosts.
So in that case we need a way for a process, local or remote, to access local or remote resources easily, and for a human user to control his/her workflow's processes from the current host even if those processes reside on a remote host.
So ideally, for example, we would like to see a remote X app on our local screen and drag and drop between it and local X apps, or establish pipelines mixing remote and local processes.
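A pipeline that mixes remote and local processes can already be approximated with ssh, because ssh connects the remote command's standard output to the local pipe. The sketch below assumes an ssh login on the remote host; REMOTE, SSH and the function name are hypothetical names introduced for the illustration.

```shell
REMOTE="${REMOTE:-user@hostB}"   # placeholder account and host (assumption)
SSH="${SSH:-ssh}"

# Count the lines of a remote file that match a pattern.
# Stage 1 (cat) runs on host B; stage 2 (grep -c) runs on host A.
count_remote_matches() {
    pattern=$1
    remote_file=$2
    "$SSH" "$REMOTE" cat "$remote_file" | grep -c -e "$pattern"
}

# Usage (needs a reachable host and a working login):
#   count_remote_matches error /var/log/syslog
```

The split between the stages is a choice: moving grep into the remote command would send less data over the link, at the cost of running more of the workflow on host B.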
emacs remote file access 🔼
Local emacs does some editing and we save it. Remote emacs still works on the previous buffer state, and if we save from it we can destroy the local changes.
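The lost update described above can be reproduced without emacs or a second host. The sketch below is a simulation only: plain files in a temporary directory stand in for the data object and for the two editor buffers, and all file names are made up for the illustration.

```shell
# Simulate the lost update with plain files; no network or emacs involved.
work=$(mktemp -d)
printf 'v0\n' > "$work/data"          # the shared data object
cp "$work/data" "$work/buf_local"     # local instance loads its buffer
cp "$work/data" "$work/buf_remote"    # remote instance loads its buffer

printf 'local edit\n' >> "$work/buf_local"
cp "$work/buf_local" "$work/data"     # local instance saves

printf 'remote edit\n' >> "$work/buf_remote"
cp "$work/buf_remote" "$work/data"    # remote instance saves its stale buffer

cat "$work/data"                      # the line "local edit" is gone
```

The second save does not merge anything; it replaces the file with a buffer that never saw the first save.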
rules 🔼
I think that in a distributed workflow (dwf) of one user, we shouldn't access the same data object from more than one program instance executing on different nodes. To achieve that, we must have programs able to access data anywhere in the network, or we must be able to transfer the data to where the program is executing.
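The second option, moving the data to where the program runs, could look like the sketch below. It assumes scp access to the remote host; SCP, EDITOR_CMD and the function name are hypothetical names introduced here, and nothing in it stops another process from changing the remote file while the copy is being edited.

```shell
SCP="${SCP:-scp}"                          # copy tool (assumption)
EDITOR_CMD="${EDITOR_CMD:-${EDITOR:-vi}}"  # the one program instance we use

# Fetch a remote file, edit the copy locally, then push it back.
# $1 is an scp-style location such as user@hostB:/path/to/file
edit_remote_copy() {
    remote_path=$1
    tmp=$(mktemp) || return 1
    "$SCP" "$remote_path" "$tmp" || { rm -f "$tmp"; return 1; }
    "$EDITOR_CMD" "$tmp" || { rm -f "$tmp"; return 1; }
    "$SCP" "$tmp" "$remote_path"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Only one program instance, the local editor, ever touches the data, which is the point of the rule.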
tools and utilities 🔼
telnet 🔼
In the context of a dwf model, what does telnet do? Telnet offers access to a remote compnode. So it could be used attached to just one compnode, and thus not be part of a dwf. In order to set up a dwf we need to transfer data between compnodes (we always assume that one user has access to resources she owns on both compnodes). We could say that if we start rlogin from hostA and we
rlogin 🔼
ftp 🔼
ssh 🔼
sftp 🔼
scp 🔼
X11 forwarding 🔼
emacs tramp 🔼
When you are prompted to open a file or directory in Emacs, you can instead add an access method to the directory path.
Behind the scenes, invisible to the user, Emacs establishes a connection to the server and allows you to access files as though they were located on your local computer.
In order to open a Tramp buffer, just type C-x C-f and put the "ssh:" method in front of the remote path:
/ssh:root@example.com:~/
xpra 🔼
xpra / installation 🔼
(local) $ sudo apt-get install xpra
(remote) $ sudo apt-get install xpra
xpra / workflow 🔼
Why would we need xpra?
In distributed workflows we assume that a process, or data, or both, or in general some part of our workflow, is not on our local host.
My current understanding of how xpra is used: assuming it is installed on all Devuan/Linux hosts in a network, we can start an X program on a certain host without it appearing anywhere yet, and then from any host we want (even the host where the X app runs) we can attach to it, see it, and work with it. But that can be done from only one host at a time (I think!). We can detach from it and attach from another host.
When we start an X program remotely and see it locally, can we also see it if we sit in front of the remote host?
xpra / start a remote X program 🔼
$ xpra start :100 --start=fooXclient
- xpra start : Starts the xpra server.
- :100 : The display number. If you don't specify a display number, xpra chooses a free one automatically.
- --start=fooXclient : Starts the program named after the equals sign inside the xpra session. If no program is specified, xpra won't start any application.
xpra / connect from local host to the remote X program. 🔼
$ xpra attach ssh://chomwitt@192.168.1.86/100
xpra / view all the running Xpra instances 🔼
View all the running xpra instances:
$ xpra list
xpra / see an X app from the host where xpra runs. 🔼
Can we do that ?
xpra / start a remote X program without starting a remote xpra server 🔼
$ xpra start ssh:chomwitt@192.168.1.86 --exit-with-children --start-child="xfe"
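The xpra commands in this section can be gathered into one small set of helper functions. This is a sketch under assumptions: REMOTE, DISPLAY_NUM, XPRA and the function names are introduced here for illustration, and the stop subcommand is used in addition to the subcommands shown above.

```shell
XPRA="${XPRA:-xpra}"
REMOTE="${REMOTE:-user@hostB}"      # placeholder account and host (assumption)
DISPLAY_NUM="${DISPLAY_NUM:-100}"   # same display number as in the examples

# Run on the host that should execute the X program:
# start a server on the display and launch the program inside it.
xpra_start_app() { "$XPRA" start ":$DISPLAY_NUM" --start="$1"; }

# Run on the host where the user sits: attach to the session over ssh.
xpra_attach() { "$XPRA" attach "ssh://$REMOTE/$DISPLAY_NUM"; }

# Run on the host where the server lives: list sessions, stop this one.
xpra_list() { "$XPRA" list; }
xpra_stop() { "$XPRA" stop ":$DISPLAY_NUM"; }
```

A typical round would be xpra_start_app on host B, xpra_attach from host A, and xpra_stop on host B when the program is no longer needed.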