The FBP / Pd divide

Added by jcw almost 3 years ago

I’m really struggling with the difference between FBP’s async approach, where channels can have a capacity > 0, and Pd’s approach, where a wire is really just a link from caller to callee. Pondering on this, I think the difference is that FBP supports “back pressure”, i.e. if a message isn’t read, it’ll start blocking the sender, and this can cascade all the way back to the origin (as well as cause deadlocks if there are design errors). It’s like hitting ^S in the terminal: you stop the process performing output, and it resumes when typing ^Q (or like using “less” to view the first page). In Pd, there is no such thing: if the receiver can’t deal with the messages thrown at it, the entire thing slows down as it reaches 100% cpu, trying to process all the data and perform all the calls - and gets out of sync with real time.
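A minimal sketch of the back-pressure effect, using a plain Go channel as the "wire". The helper name trySend is made up for this illustration and is not part of Flow; it uses a non-blocking send only so the demo can report the moment where a normal blocking send would stall the sender.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the wire
// accepted the message. A false result marks the point at which an
// ordinary blocking send would stall the sender (back pressure).
// Hypothetical helper, for illustration only.
func trySend(wire chan<- string, msg string) bool {
	select {
	case wire <- msg:
		return true
	default:
		return false
	}
}

func main() {
	wire := make(chan string, 2) // capacity 2, nobody is reading yet

	fmt.Println(trySend(wire, "a")) // true: buffered
	fmt.Println(trySend(wire, "b")) // true: buffered
	fmt.Println(trySend(wire, "c")) // false: wire is full, a sender would block here

	<-wire                          // the receiver finally reads one message
	fmt.Println(trySend(wire, "c")) // true: room again, the sender can resume
}
```

With a capacity of 0 the same thing happens on the very first message, which is the fully synchronous case.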

To put it differently: FBP is like having lots of tiny processes, all looping to get messages, do something with them, and spewing them (or other messages) out again. Pd is more like a big call tree, where the first message coming in leads to lots of nested calls to the downstream objects, following all the connections (in a fully specified order). FBP is lots of little threads, each with their own stack, and able to automatically take advantage of multi-core parallelism, whereas Pd is one (potentially deeply nested) call stack to process each incoming message from start to finish.

FBP seems more powerful, Pd seems simpler. FBP is individualistic, Pd is orchestrated. FBP supports messages from any input to drive its next action, Pd has a “hot” input and the rest is cold (last value stored, but does not trigger processing).
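To make the hot/cold distinction concrete, here is a rough sketch of a Pd-style adder in Go. The Adder type and its method names are assumptions for this example, not code from Pd or Flow. The cold inlet only stores a value, the hot inlet computes and calls the downstream object directly, so everything runs on one call stack.

```go
package main

import "fmt"

// Adder mimics a Pd-style "+" object (hypothetical type for illustration).
type Adder struct {
	right int       // last value seen on the cold inlet
	Out   func(int) // downstream connection, invoked as a direct call
}

// Cold stores the value without triggering any processing.
func (a *Adder) Cold(v int) { a.right = v }

// Hot triggers the computation and calls downstream synchronously.
func (a *Adder) Hot(v int) {
	if a.Out != nil {
		a.Out(v + a.right)
	}
}

func main() {
	add := &Adder{Out: func(v int) { fmt.Println("sum:", v) }}
	add.Cold(10) // stored only, nothing is printed
	add.Hot(1)   // prints "sum: 11"
	add.Hot(2)   // prints "sum: 12"
}
```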

It’s a very fundamental design choice. I’ve cross-posted this on the Tosqa forum as well, to try and get more insight into what the implications are - for HM as well as TQ.


Replies (12)

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

Here’s an idea for bringing FBP and Pd closer together: stick with the current FBP model, with its flow.Input and flow.Output pins, but also add support for named parameters, using Go’s expvar package. Declaring say a “Baud expvar.Int” as part of a gadget would create a special type of input pin, which can hold an integer value. Any output pin or feed can be connected to it, with Flow taking care of storing the values sent over those wires into the variable, but without triggering or affecting the gadget’s channel input activity.

So normal inputs in Flow are still used to drive the processing loop in each gadget, but now a gadget can access its “parameters” - and these can be set via other (immediate) connections from the outside. The big change is that outputs tied to such named parameters never block. Internally, Flow would route all such wires to a separate goroutine in the parent circuit, which then immediately sets those parameters when a message is sent, taking care of simple type conversions, if needed.
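A rough sketch of what such a parameter could look like, since none of this is implemented yet. SerialGadget, its fields and the Run signature are invented for the example; only expvar.Int with its Set and Value methods comes from Go's standard library. The routing goroutine in the parent circuit is left out: the parameter is simply set directly.

```go
package main

import (
	"expvar"
	"fmt"
)

// SerialGadget is a hypothetical gadget with one normal input and one parameter.
type SerialGadget struct {
	In   chan string // normal input: drives the processing loop
	Baud expvar.Int  // parameter: set from outside, never triggers processing
}

// Run handles messages from In until it is closed, reading Baud when needed.
func (g *SerialGadget) Run(out func(string)) {
	for msg := range g.In {
		out(fmt.Sprintf("%s @ %d baud", msg, g.Baud.Value()))
	}
}

func main() {
	g := &SerialGadget{In: make(chan string)}
	done := make(chan struct{})
	go func() {
		g.Run(func(s string) { fmt.Println(s) })
		close(done)
	}()

	g.Baud.Set(57600) // parameter write: returns immediately, never blocks
	g.In <- "hello"   // normal message: prints "hello @ 57600 baud"
	close(g.In)
	<-done
}
```

Note that the Run loop contains no code for receiving the parameter, which is the point of the idea.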

I haven’t implemented this yet, just thinking out loud. I hope this description makes sense…

PS. I’ve changed the dev branch of Flow to use []interface{} as type for flow.Message, i.e. a list of interfaces instead of an interface. Tags are now no longer special types, but lists of which the first item is a string.

RE: The FBP / Pd divide - Added by tve almost 3 years ago

Isn’t []interface{} the worst of all worlds? You get a completely non-self-documenting structure (what’s msg[2] again?) and you impose structure over interface{} ? At least with a map you could say that msg[‘type’] is the type, msg[‘value’] is the default “slot” for a value, etc.

WRT buffering there is quite a big performance implication. Take a measurement of the number of context (goroutine) switches that happen when the jeeboot hex files are loaded. It’s quite insane. I had a print statement in there at some point and I quickly stopped that experiment.

RE: The FBP / Pd divide - Added by lightbulb almost 3 years ago

Internet at last!…

@jcw,

What’s the deal with the puredata (pd) references? RBTL it seems you like the
gui at least as a reference for circuit edit? But you also ref the p2p
nature of messages? You then go on to say “bring Fbp and pd closer
Together”.
Have u tried running pd on a low end platform? It did not work well for my
daughter…but admittedly I have not spent much time seeing how she is
getting on with it. I do know she runs it on a high end 2.3g dual core
laptop…I’ll discuss this for more insight when I am home.

Whilst I have been on hols, I have thought about the gadget /channels issue
myself, and even the param/msg loops issues.
I have a few ideas, but the main one is related to moving message routing
directly to a circuit where a circuit is a != to a gadget interface wise,
as is case when I left.

This would make circuit more complex, but gadgets simpler, and channel
buffers would be less of an obstacle. Circuits could easily be
paused/resumed and gadgets dynamically inserted. Circuits would only have 1
input/output but internally can multiplex gadget pins.
I’ll try examples when I get back and settled, and post something more
concrete, as right now am doing this post on phone…

Ps: I see your TQ project, but is that community aligned with HM. What’s
the common ground other than flow and a need for a gui?
Would be good to introduce us all to each other in some way perhaps.?

Looking fwd to seeing where things are at when I get home.

Lightbulb

On 1 Jul 2014 21:46, redmine@jeelabs.net wrote:

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

A message is like a function call: first arg is type, the rest are arguments. So msg[0] could be "map" and msg[1] could be a map[string]interface{}, for example. Using the “sequence” as basic type makes it trivial to wrap a message by inserting stuff at the front - very useful for routing and highly generic (similar to ZeroMQ).

By using []interface{}, I can attach methods, so msg.Tag(), etc can be defined - this wasn’t possible before. This means all messages can now have standard behaviour, and have a bunch of msg.Method(...) definitions for use in all gadgets.
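A small sketch of what this allows, assuming the type Message []interface{} definition. Tag is the method named above; Wrap and the sample tag strings are made up for the example and are not necessarily what Flow's dev branch contains.

```go
package main

import "fmt"

// Message is a list of values; by convention the first item is a string tag.
type Message []interface{}

// Tag returns the leading string, or "" if the message has none.
func (m Message) Tag() string {
	if len(m) > 0 {
		if s, ok := m[0].(string); ok {
			return s
		}
	}
	return ""
}

// Wrap returns a new message with extra items inserted at the front,
// leaving the original untouched (hypothetical helper for routing).
func (m Message) Wrap(prefix ...interface{}) Message {
	out := make(Message, 0, len(prefix)+len(m))
	out = append(out, prefix...)
	return append(out, m...)
}

func main() {
	msg := Message{"temp", 21.5}
	routed := msg.Wrap("node-7")
	fmt.Println(msg.Tag(), routed.Tag(), len(routed)) // temp node-7 3
}
```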

Another approach would be to switch to interface{ ... Method(...) ...} as Message, i.e. opaque data, but I think the sequence model is actually more natural and more concise (not just in Go, also in JavaScript).

As for buffering: you’re intercepting calls, so yes, there are a lot of them - that’s like inserting a print statement in each function. The difference between a 1) direct call and 2) the channel + goroutine approach, is 1) a function call w/ stack push and pop of stuff vs 2) a mutex-protected channel queue insertion + stack switch. On a fast Mac, I got 16 million messages per second over a channel with capacity 0, i.e. synchronous execution. I don’t see Go’s channel + goroutine overhead as being more of an issue than say a simple function call in a scripting language. Even for a highly optimised JavaScript + v8 JIT setup. Let’s argue over practical results, rather than assumed effects taken out of context. Besides, really fast stuff shouldn’t be done across channels, i.e. we wouldn’t want to program substantial algorithms by splitting each statement into a gadget.

Having said that, I’m investigating the idea of moving away from chan<- and <-chan as exposed primitive actions in gadgets, maybe pin.Send(...) and pin.Recv(...) lead to better hiding of the underlying mechanism (since the use of channels can then be decided case by case).

Have a look at Pure Data (especially Pd-extended, which is available as pre-built app for several platforms). Max/MSP is similar, but pricey - it has a “gen~” add-on, which takes flow diagrams, turns them into C++ code, compiles them, and loads it back in as dynamically loaded extension. This means hot spots can be turned into very efficient code. I don’t want to break the pure-Go aspect of the Flow core for various reasons, but a pipe to an external process could be used to tie such things in - a quick test gave me 8 µs latency and 400 Mb/sec for messages over a pipe in Linux (on a “modest” 1.4 GHz Core 2 Duo), which means fast integration is easily in range.

Not that speed is anywhere near a limiting issue for HouseMon, even on a RasPi. We’re talking at most 100 events per second in the most extreme case for home monitoring and automation, I expect. The only reason I’m mentioning all this, is that I’d like to aim for quite a bit wider application scope for the Flow “core” engine. There’s the Tosqa CNC side of things, as well as a “Bento Lab” future project :)

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

My previous post was for tve. This one's for lightbulb…

Can we please stop focusing on performance? In HouseMon, it’s a non-issue, even with a RasPi as host server. I’ve been trying out the 64-bit version of Pd 0.43.4 on an 11" MBA w/ 1.4 GHz Core 2 Duo. Works fine (though app startup takes a few seconds).

I’ve also been thinking about routing everything through the parent circuit, but it will lead to more complex situations in the case of blocking and back-pressure, I think. The non-channel “parameter” input idea described above might be a way to get the best of FBP and Pd in one. Still trying to think things through at this stage.

The commonality between HouseMon and Tosqa is, IMO: real-time physical computing, similar hardware environments, with bus real-time constraints in the millisecond range for TQ + CAN bus, and in the sub-second range for HM + wireless. For both, I see dataflow design as a really good fit - to modularise the application (it’s almost like introducing functional programming without having to think that way), but also as a way to make the entire system easily configurable and extensible for non-programmers. Whenever I show the (still rudimentary) circuit editor to people, the coin drops - they immediately can see themselves adding stuff. But although the CE is neat, it’s really just lipstick - the real beef is the “soft” real-time we can get from all this, on both the host side and the user interface side in the browser. Ultimately, I expect that the host (and later also the browser and the embedded firmware) will become nothing but a generic “main” with a set of built-in gadgets and lots of circuits built on top.

In a nutshell: Tosqa is modular hardware and software for machine control. The recent demo (for mostly technical hobbyists, but not necessarily software geeks like you and me) used an adapted CNC router to draw the Tosqa logo on a sheet of paper. Nothing earth shattering, but it did demonstrate some of the core ideas of it all.

RE: The FBP / Pd divide - Added by lightbulb almost 3 years ago

@jcw,

I was not “focusing” on performance, just mentioning because my daughter
originally ran pd on a 900mhz atom cpu and it simply could not cope with
what she was asking of it at the time, over a year ago now, so things may
have changed, but she was new to pd then, her music friends set it up for
her, but moved it onto her full laptop.

Anyway, that’s sidetrack…

Using circuit as a full blown scheduler is certainly more complex, but
perhaps only way in end. More experiments your end and mine I guess.

There are lots of things going on in flow world right now, can’t keep up, would
not want to reinvent too many wheels. Think it important to support Fbp
network proto from this point on.
What are your thoughts about adopting some of these open “standards” or
even helping shape them….

On 2 Jul 2014 15:08, redmine@jeelabs.net wrote:

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

The current commit (010b2547082352b37b282316d229d1873671af68) of the “dev” branch of Flow might be a good place to fork if you want to try out some new designs - it’s all test-driven and has very little “fluff”. This version uses the following definition as message type:

type Message []interface{}

It’s just one of many choices, as @tve pointed out, but I like it best, so far - with the convention that the first item is normally a string describing the message type, object type, or command name - depending on how you want to interpret it.

This choice leads to two special cases: nil and Message{}, which are distinguishable from one another, and can both be sent over a channel as currently defined in Flow. The second case is very much like Pd’s “bang”, i.e. a message with no content. The “nil” case could be used to send an EOF-like signal over the wire, i.e. perhaps a request to close it, or as a marker to precede some out-of-band data sent right after it. Both also have representations in JSON: null and [], respectively, so they can also travel over a WebSocket.
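The two special cases can be checked with a few lines of Go. The helper names kind and encode are invented for this sketch; the JSON behaviour shown is that of the standard encoding/json package, which encodes a nil slice as null and an empty one as [].

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Message []interface{}

// kind tells the two special cases apart from ordinary messages.
func kind(m Message) string {
	switch {
	case m == nil:
		return "nil"
	case len(m) == 0:
		return "bang"
	default:
		return "data"
	}
}

// encode returns the JSON form of a message, or "" on failure.
func encode(m Message) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	wire := make(chan Message, 3)
	wire <- nil                  // EOF-like marker
	wire <- Message{}            // "bang": a message with no content
	wire <- Message{"temp", 21.5} // ordinary tagged message
	close(wire)

	// The distinction survives the trip over the channel.
	for m := range wire {
		fmt.Println(kind(m), encode(m))
	}
}
```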

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

As my mentor, Andy Tanenbaum, used to say: http://www.quotationspage.com/quote/473.html :)

My whole adventure has been about chasing the things which already exist and work, i.e. FBP and now Pd / Max. I’m all for idea re-use, and all the benefits you get from learning how others do things, why they do things that way, and how it works out in practice. I’m often less inclined to re-use implementations, though - software development tools and practices evolve in major ways over time. And even with the same tools, I don’t always agree with the fundamental choices made, such as with “goflow”, for example.

In the case of NoFlo, I’m quite sceptical. I think they’ve completely broken the basic premise of the original FBP design (which is archaic in its implementation IMNHSO, but illuminating in its conceptual design and choices made) - using an event-based system like Node.js with callbacks and re-using that same approach in the browser, you lose the essence of the “encapsulated little-process execution state” behaviour of FBP’s original components, if you ask me.

Maybe we can tie into existing developments, but I suspect that NoFlo’s editor won’t fit the data structures we end up with in Flow. Same for the FBP parser: it makes assumptions which might conflict with the design coming out of Flow.

The benefit of rolling your own, while trying to find as many giant’s shoulders as possible to stand on, is that you get a much deeper insight into the trade-offs. Everything I’ve learned since starting on FW, JB, HM, and TQ has increased my understanding.

There’s a tradeoff w.r.t. participating in other projects (including this one!), in that things move in a certain direction, and people tend to have skills and preferences which make certain evolution paths more likely than others. Right now, I’m not in a hurry to get anywhere soon, although very impatient - I definitely want to end up with practical software, but it’ll just have to take as long as it takes. The progression is very clear to me: flow-based is a given, so is Go for me by now, and the gadget/circuit terminology seems to be useful enough to keep.

Perhaps there’s a project out there which would be a great fit, but so far I haven’t found it. NoFlo w/ Node.js is out for the reasons mentioned above, and it is too resource-heavy for the low-end hosts I am aiming for, and Pd / Max is not web-based. There’s a “WebPd” project on GitHub, but I’m not really willing to completely abandon the FBP model and Go’s channel + goroutine architecture.

It’s clearly a trade-off. By insisting on a low-end embedded Linux as server and a web-based browser, this immediately rules out a bunch of options. I don’t quite see how to align these choices with what’s out there. Let’s also not ignore the fact that this is all voluntary work, with fun being quite an essential part of the equation!

RE: The FBP / Pd divide - Added by lightbulb almost 3 years ago

@jcw,

Funny as it seems, it’s sort of comforting to know, especially last para.

It concentrates one’s mind to know that a tree has started to grow some
roots. Let’s hope that enough sun shines in this part of the forest.

Lightbulb
On 2 Jul 2014 16:42, redmine@jeelabs.net wrote:

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

Has anyone looked into StreamTools? It’s ridiculously similar to what I’m trying to do with Flow.

RE: The FBP / Pd divide - Added by jcw almost 3 years ago

Ok, while abroad on a trip for a few days, I’ve had some time to ponder on Flow, the size of the universe, and a few other puzzles. It’s been refreshing to try and gain some “altitude” on all that.

FBP is more general than Pd, I think. It’s also bound to be slightly less efficient, when implemented with channels and goroutines. But the key is that it can do more, i.e. I now think that everything Pd does should be doable in an FBP-like design as well.

As for speed: it’s most probably irrelevant for HouseMon, and almost irrelevant for Tosqa. Nevertheless, there’s an excellent way out: in Flow 0.10, I’m going to hide the channel mechanism behind an interface. As mentioned before, this means channel sends will be replaced with “Send()” method calls again, as they were before. Similar with “Recv()” replacing the current “for … range …” notation. Since goroutines were already hidden beneath the surface of Flow, this means it will be up to Flow’s internals to decide whether to use channels+goroutines, or direct calls. This decision could be different for each connection.
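A sketch of how hiding the mechanism behind an interface might look. Only the Send name comes from the plan above; the Output interface, the two implementations and emit are assumptions for illustration and say nothing about how Flow 0.10 will actually be structured.

```go
package main

import "fmt"

type Message []interface{}

// Output is all a gadget gets to see; how messages travel is hidden behind it.
type Output interface {
	Send(m Message)
}

// chanOutput delivers over a channel (FBP-style: the receiver would
// normally run in its own goroutine).
type chanOutput chan Message

func (c chanOutput) Send(m Message) { c <- m }

// callOutput delivers with a direct function call (Pd-style).
type callOutput func(Message)

func (f callOutput) Send(m Message) { f(m) }

// emit stands in for gadget code: it is identical for both kinds of wire.
func emit(out Output) {
	out.Send(Message{"tick", 1})
}

func main() {
	// Direct call: the receiver runs on the sender's stack.
	emit(callOutput(func(m Message) { fmt.Println("direct:", m) }))

	// Channel: the message is queued until someone receives it.
	ch := make(chanOutput, 1)
	emit(ch)
	fmt.Println("queued:", <-ch)
}
```

Because the choice is made where the wire is created, it can differ per connection without touching any gadget code.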

Outputs are going to be expanded to allow sending to multiple inputs. My plan is to make deep copies of all “[]interface{}” and “map[string]interface{}” messages, i.e. copying the structure of complex messages, without copying the individual values. No copies will be made when only a single input has been connected. This will allow gadgets to take a message and extend or reduce pieces inside that message, without affecting the processing of the same (well, copied) message in other gadgets.
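The copying rule could be sketched as follows. The function name copyStructure is an assumption; it duplicates nested []interface{} and map[string]interface{} containers and shares everything else. Note that it matches the plain container types, so a value of a named type such as Message would have to be converted first.

```go
package main

import "fmt"

// copyStructure duplicates nested slices and maps but shares leaf values.
// Hypothetical helper, not taken from Flow.
func copyStructure(v interface{}) interface{} {
	switch t := v.(type) {
	case []interface{}:
		out := make([]interface{}, len(t))
		for i, e := range t {
			out[i] = copyStructure(e)
		}
		return out
	case map[string]interface{}:
		out := make(map[string]interface{}, len(t))
		for k, e := range t {
			out[k] = copyStructure(e)
		}
		return out
	default:
		return v // leaf value: passed along as-is
	}
}

func main() {
	orig := []interface{}{"cfg", map[string]interface{}{"baud": 57600}}
	dup := copyStructure(orig).([]interface{})

	// One receiving gadget changes its copy; the other copy is unaffected.
	dup[1].(map[string]interface{})["baud"] = 9600
	fmt.Println(orig[1], dup[1]) // map[baud:57600] map[baud:9600]
}
```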

Also on the horizon, again as described before: special inputs representing values which don’t trigger further processing (Pd’s “cold inputs”). These simplify passing in parameters, config settings, etc. The benefit here is that the reception of these values requires no extra coding in the “Run()” loop of a gadget, with the nice extra that they also prevent deadlocks.

Since Send and Recv now hide channel logic, they can also add checks for termination, which will help to cleanly shut down a circuit again (e.g. a websocket close, or a device being unplugged).

Wires have always had a “capacity” until now, i.e. the number of messages buffered for that wire - but IMO this can be dropped and replaced with a separate buffering gadget. The reality is that unbuffered wires are usually fine, buffering is only rarely needed. By making the mechanism explicit, I think the designer of a circuit will be more inclined to carefully make such a choice.

W.r.t. deadlocks, I will be adding some sort of “impatient pipe” gadget, i.e. a buffer with a timeout for sending packets out. By inserting this in a pipeline, you can then explicitly prevent a deadlock when you think this is needed.

Lastly, I’m going to try turning circuits into interpreters, i.e. to associate a channel with each of them, over which they can be commanded to make changes to the circuit, such as adding/removing gadgets, wires, etc. Another way to put this is that circuits will no longer depend on a special API to be configured (as it is now), but simply have yet another input pin, to which they listen for configuration commands. This will simplify the way a circuit editor in the browser can act on a live setup.
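To illustrate the "circuit as interpreter" idea, here is a minimal sketch. The Circuit type, the "add" and "wire" commands and the three-item command format are all assumptions made up for this example; a real circuit would create and connect live gadgets rather than just record names.

```go
package main

import "fmt"

type Message []interface{}

// Circuit keeps its configuration as plain data, changed only via commands.
type Circuit struct {
	Gadgets map[string]string // gadget name -> gadget type
	Wires   map[string]string // "from.pin" -> "to.pin"
}

func NewCircuit() *Circuit {
	return &Circuit{Gadgets: map[string]string{}, Wires: map[string]string{}}
}

// Apply interprets one command message of the form {op, arg1, arg2}.
func (c *Circuit) Apply(cmd Message) error {
	if len(cmd) != 3 {
		return fmt.Errorf("bad command: %v", cmd)
	}
	op, ok0 := cmd[0].(string)
	a, ok1 := cmd[1].(string)
	b, ok2 := cmd[2].(string)
	if !ok0 || !ok1 || !ok2 {
		return fmt.Errorf("bad command: %v", cmd)
	}
	switch op {
	case "add": // add a gadget: name, type
		c.Gadgets[a] = b
	case "wire": // connect two pins: from, to
		c.Wires[a] = b
	default:
		return fmt.Errorf("unknown command: %q", op)
	}
	return nil
}

// Listen applies commands arriving on the config pin until it is closed.
func (c *Circuit) Listen(config <-chan Message) {
	for cmd := range config {
		if err := c.Apply(cmd); err != nil {
			fmt.Println("ignored:", err)
		}
	}
}

func main() {
	config := make(chan Message, 3)
	config <- Message{"add", "clock", "Clock"}
	config <- Message{"add", "log", "Printer"}
	config <- Message{"wire", "clock.Out", "log.In"}
	close(config)

	c := NewCircuit()
	c.Listen(config)
	fmt.Println(len(c.Gadgets), "gadgets,", len(c.Wires), "wire")
}
```

Since commands are ordinary messages, a circuit editor in the browser could send them over the same kind of connection as any other data.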

With all these changes, it should become possible to implement gadgets which act as pure “FBP components”, or as pure “Pd objects”, or any mix in between. And even though channels + goroutines are very efficient in Go, I expect that they can even be replaced by direct calls in some of the simpler circuits.

Lots of plans, but with a bit of luck, it’ll reduce complexity and reduce the amount of code needed in the core flow engine.

RE: The FBP / Pd divide - Added by jcw over 2 years ago

I’ve been working my way into a new solution for all this over the past month or so.
Current status, mid-January 2015:

  • the jcw/housemon repository on github has the last FBP-based code, it hasn’t changed for over half a year
  • a new flow design, based more on Pd than FBP, is being created in the jeelabs/jet repository
  • the new “JET/Flow” is designed to support dynamic circuit changes from day one, which was where the previous FBP-based design got stuck
  • it’s all about data flow, circuit-/gadget-based development, and ultimately also a visual circuit editor, I hope
  • my goal is to use it for HouseMon, but the code is months away from being of any use - other than for developers interested in this Go-based stuff

So, yeah, patience. Some designs take a lot of time to unfold. Then again: easy problems are perhaps predictable, but a lot less fun…
