
Author Topic: Who'd be interested in network subscribed heap memory? Absurd I know...  (Read 4288 times)

kveroneau

  • Full Member
  • ***
  • Posts: 119
This post isn't about anything specific to FreePascal; it's rather a general computer-science question I would like some opinions on.  I wasn't entirely sure where to put it: although I will be discussing a potential server application written in FreePascal, the question isn't related to FPC networking or web programming, so I thought this was the best place for the post.

I recently had the idea to write a server application which allows various client applications to share a heap of memory through a subscription model.  It is similar to a message queue, but rather than sharing structured messages, a page of memory is shared between various applications over the network.  This page of memory is then treated like a regular page of memory in the application, but it is updated and synced with all the other client applications subscribed to the same page on the server.  This allows programs to easily share raw data structures in memory with extremely low overhead; think of uses in embedded devices, for example, where processing power might not be enough to parse other data formats.  With this new server, the embedded device grabs the current state of the heap from the server and subscribes to new updates.  Once a new update is ready, the device locks the heap pointer on its side while the update is transferred from the server, replacing the existing memory in the embedded application.  While most of the world has moved towards everything being encapsulated in the HTTP protocol these days with REST, the server I am writing uses an extremely lightweight persistent TCP protocol stream.  For embedded programming this is much more efficient, but it can also work on servers, such as in a microservices environment where data moves very quickly between multiple applications.  And of course, I am currently developing both the server software and the client library in FreePascal.
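The flow described above can be sketched as a minimal in-process model. All names here are hypothetical illustrations, not the project's actual API; the real server and client speak a persistent TCP protocol and are written in FreePascal.

```python
# Toy in-process model of the page-subscription idea.  Hypothetical
# names; the real system is networked, this only shows the flow.
class PageServer:
    def __init__(self):
        self.pages = {}        # page number -> raw bytes
        self.subscribers = {}  # page number -> list of callbacks

    def subscribe(self, num, callback):
        self.subscribers.setdefault(num, []).append(callback)

    def read_page(self, num):
        return self.pages.get(num, b'')

    def write_page(self, num, data):
        self.pages[num] = data
        # Notify every subscriber that the page changed; each client
        # then pulls the new contents itself (notify-then-pull).
        for notify in self.subscribers.get(num, []):
            notify(num)
```

A client would subscribe once, then re-read the page from its callback whenever a write by any other client fires the notification.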

I am currently debating whether this project/idea will actually be useful to a wider audience.  This is why I am posting today: I'd like to know this community's opinion on such a project, and whether anybody else might find it useful.  If so, I plan to publish it on GitHub; I still have yet to decide on a license.  The initial prototype server and Pascal client unit are almost ready: not production ready, but ready to be shown and tested by a wider audience if anybody is indeed interested in this concept for their own projects.

My current focuses for this project are:
  • Authentication, and limiting access to specific pages based on it.  So, say, this client can read and write these pages, but can only read that one.
  • Over-the-wire encryption; this is where I'd like a bit more community support to ensure it's secure enough.
  • Subscribing to pages, and having the clients notified of any updates over their persistent TCP connection.
  • Developing a management front-end for the server in Lazarus to manage pages, connections, and authentication; perhaps even the ability to view a page in a hex editor/viewer.
  • Currently the server supports what I call Application identifiers and Type identifiers.  These can be used to search for a specific page from the client without knowing the page number.  So, if you mark a page with an AppId of $34, it can be found via a look-up table on the server.
  • Pages can currently be given a title as well; this cannot be searched, but it is useful for identifying a specific page in a management front-end tool that lists all the pages on the server.

Which leads me to my last paragraph: how did I come up with such an absurd idea?  Well, I've always had a weird fascination with raw memory, and with the memory cards used in older video game consoles.  I originally created my own version of a memory card system in FreePascal, where I can easily create a file on my disk divided into distinct blocks.  The block size is entirely adjustable, with 512 bytes as the smallest size.  After playing with this code in a local, non-networked program, I wondered how it could scale over a network, and so this new project was born.  I dubbed the project "Memory Card Server" and started work rather quickly on making these memory cards network accessible.  The server supports multiple distinct "memory card" files with different block sizes depending on the client application's requirements.  Each memory card, with its own block size, is then divided further into blocks of memory based on the formatted block size, with Block 0 reserved for the custom header and, I guess, a FAT table which holds the title, AppId, and TypeId data.
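A Block 0 directory entry along those lines might pack the title, AppId, and TypeId like this. The field widths and layout here are guesses for illustration, not the project's actual on-disk format; the byte order is pinned so the layout is unambiguous regardless of host.

```python
import struct

# Hypothetical Block 0 directory entry: a fixed-width title plus the
# one-byte AppId and TypeId.  '<' pins little-endian and disables
# padding, so the packed layout is exactly 16 + 1 + 1 = 18 bytes.
ENTRY = struct.Struct('<16sBB')

def pack_entry(title, app_id, type_id):
    # NUL-pad the title to the fixed 16-byte field.
    return ENTRY.pack(title.encode().ljust(16, b'\0'), app_id, type_id)

def unpack_entry(raw):
    title, app_id, type_id = ENTRY.unpack(raw)
    return title.rstrip(b'\0').decode(), app_id, type_id
```

For example, `unpack_entry(pack_entry('settings', 0x34, 0x01))` round-trips to `('settings', 0x34, 0x01)`, and a FindApp-style lookup table would just map `app_id` back to block numbers.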

MarkMLl

  • Hero Member
  • *****
  • Posts: 6646
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #1 on: September 18, 2021, 10:26:40 pm »
I'll bite. If you propose to serve an image of raw memory, how do you intend to describe what's in it, particularly if you have a heterogeneous population of big- and little-endian systems?

CORBA and other RPC-like systems handle it by having a complex IDL, but such things tend to be... complex. I was on the margins of a project in the late '80s where a company in the UK's West Country tried to write its own in-house business management system based on the owner's understanding of Smalltalk: it required multiple class servers to provide formatting information and Did Not Go Well.

I'd conclude by saying that the words "TCP" and "lightweight" are rarely seen in the same sentence. If you really want to turn your talents to something useful, how about a cut-down version of SCTP which is sufficiently complete to interwork with PC stacks but can also be implemented on e.g. an Arduino?

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

egsuh

  • Hero Member
  • *****
  • Posts: 1266
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #2 on: September 19, 2021, 02:49:51 am »
I'm not quite sure I'm understanding you correctly, but I think this could have very wide applicability.
The simplest example would be web pages. Currently most of them are saved as files on the web server's disk and read into the server's memory whenever there is a request. But if applications across the network could access shared memory on the server, the file would not need to be loaded for every request.

Or think about a fixed in-memory dataset, like Free Pascal's TBufDataSet. If all applications could access this memory (assuming they only read it), this would reduce calls to the DB server. I could, for example, write a web-browser program assuming there is an active dataset within the browser.

Forgive me if I'm misunderstanding your idea wholly. But this is my first impression of your explanation (for now ^^).

kveroneau

  • Full Member
  • ***
  • Posts: 119
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #3 on: September 20, 2021, 02:46:24 am »
I'll bite. If you propose to serve an image of raw memory, how do you intend to describe what's in it, particularly if you have a heterogeneous population of big- and little-endian systems?
Thank you for bringing that up, actually; I don't know why, but I completely forgot about the big- and little-endianness of processors.  This is a very valid issue which will make sharing raw memory structures among incompatible systems more difficult.  I'm glad you pointed this out now, or during my testing this would have caused some interesting data-consistency issues.  I guess the initial version, until a solid solution is developed, will require both the server and the client to have the same processor endianness.

I am thinking that I will need to have the server communicate its endianness to the client during the handshake phase, and disconnect if it's incompatible.  The other solution is to have a server option which enforces a specific endianness.  Also, the server never actually needs to know the endianness of the data it's storing, as it isn't really accessing the data itself; only the connecting clients will care.  So a single x64 server with many embedded devices of the same chip/endianness will be perfectly fine.  But regardless, this is a very valid point, and I will be testing this more.
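A handshake check along those lines could be as simple as the server stating its byte order up front and the client bailing out on a mismatch. The greeting format here is invented purely for illustration:

```python
import sys

# Sketch of the handshake idea: the server declares its byte order in
# the greeting and the client disconnects if it differs from its own.
# The 'MCS1' prefix and one-letter order flag are hypothetical.
def server_hello():
    order = b'L' if sys.byteorder == 'little' else b'B'
    return b'MCS1' + order

def client_accepts(hello):
    if hello[:4] != b'MCS1':
        return False            # not this protocol at all
    server_order = 'little' if hello[4:5] == b'L' else 'big'
    return server_order == sys.byteorder
```

An enforced-endianness server option would simply hard-code the flag instead of deriving it from the host.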

I'm not quite sure I'm understanding you correctly, but I think this could have very wide applicability.
The simplest example would be web pages. Currently most of them are saved as files on the web server's disk and read into the server's memory whenever there is a request. But if applications across the network could access shared memory on the server, the file would not need to be loaded for every request.

Or think about a fixed in-memory dataset, like Free Pascal's TBufDataSet. If all applications could access this memory (assuming they only read it), this would reduce calls to the DB server. I could, for example, write a web-browser program assuming there is an active dataset within the browser.

Forgive me if I'm misunderstanding your idea wholly. But this is my first impression of your explanation (for now ^^).
No, I believe you understood it very well.  The FreePascal unit I'm writing for client applications uses TStream to move data around, so any Lazarus or FPC component that uses TStream should be able to sync data with the server application with few issues.

The idea is that the developer connects to the server using the client class, authenticates, and requests which of the memory pages the server is offering they will subscribe to.  The client class will have an event which can be assigned like any other "OnYada(Sender: TObject; ...);" event procedure.  Whenever the memory page is altered by another client connected to the server, an event is fired to each client subscribed to that page, and in turn your client application can handle it any way it wants, usually by pulling the latest page from the server and updating any data which referenced it.
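Client-side, that event pattern might look roughly like the following sketch. It mirrors the Pascal assignable-event idea in Python; all names are hypothetical:

```python
# Sketch of the client-side event pattern: a handler is assigned to
# the client (like assigning a Pascal OnYada-style event), and each
# server notification triggers a pull followed by the handler.
class MCClient:
    def __init__(self, fetch_page):
        self.fetch_page = fetch_page   # callable: page number -> bytes
        self.on_page_change = None     # assignable event handler
        self.pages = {}                # local copies of synced pages

    def handle_notification(self, page_num):
        # Pull the latest copy first, then let the application react.
        self.pages[page_num] = self.fetch_page(page_num)
        if self.on_page_change is not None:
            self.on_page_change(self, page_num)
```

The application just assigns `on_page_change` and reads `pages[page_num]` inside the handler, the same way a Lazarus form would respond to the Pascal event.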

I am actually planning to make an example Lazarus form application which syncs to this server: when one form is changed, all the forms connected to the server also change.  I will be showing this concept in an upcoming YouTube video to better explain it, so look forward to eventually seeing that.  To be clear, by all forms changing I mean the same client application running in multiple copies; the data being synced has to be the same, obviously.

kveroneau

  • Full Member
  • ***
  • Posts: 119
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #4 on: September 21, 2021, 01:35:34 am »
Video demo of two client applications written in Lazarus syncing a basic "record" custom type using the network subscribed heap memory.

https://youtu.be/sdmfpWPVj8I

Video showing off the source code (a little over half an hour): https://youtu.be/BKq4yUP-9pU
« Last Edit: September 21, 2021, 02:37:26 am by kveroneau »

devEric69

  • Hero Member
  • *****
  • Posts: 648
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #5 on: September 21, 2021, 09:20:36 am »
I am currently debating whether this project/idea will actually be useful to a wider audience.  This is why I am posting today: I'd like to know this community's opinion on such a project, and whether anybody else might find it useful.  If so, I plan to publish it on GitHub; I still have yet to decide on a license.  The initial prototype server and Pascal client unit are almost ready: not production ready, but ready to be shown and tested by a wider audience if anybody is indeed interested in this concept for their own projects.


Hello @kveroneau,

I would see a parallel with some financial charting servers (the tools that draw things that move all the time). These, basically, retrieve (H,L,O,C) data from a subscribed exchange, sometimes with 2 or 3 other fields at most. So it is originally very structured records that arrive in the memory of a graphics server.

Afterwards, typically, these graphical servers allow calculations to be made (well, the ones I've studied a bit obviously have a C API, with a header file). This is where I would see an application for your server: being able to distribute that raw data, or even better, calculations derived from it, from a graphical-server machine (which could then be called a front-end server, from the point of view of your server) towards other client machines on the network (hoping to be understood).

Nevertheless, I see at least these constraints, so that the parallel remains true and does not become false:
- as this data can be very volatile (i.e. large volumes of data), it must be possible to manage some sort of maximum size for a FIFO queue of records on the server side, and even to manage this FIFO's maximum refresh rate, in order not to see the server sink (plateaus of 100% CPU) like other types of servers.
- the fetched data, on the client side, must be able to be updated as often as we'd like; basically, it must be possible to set its refresh rate (a timer?) to 800, perhaps 500 ms. In short, to be able to run tests and stop before large plateaus of 100% CPU consumption appear on this side too.
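That client-side refresh-rate constraint could be handled with a simple throttle that coalesces bursts of notifications and pulls at most once per interval. This is a sketch of the idea, not part of the project:

```python
import time

# Minimal client-side pull throttle: notifications only mark the page
# dirty; poll() answers "pull now" at most once per min_interval
# seconds, so a burst of notifications collapses into one pull.
class PullThrottle:
    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_pull = float('-inf')
        self.dirty = False

    def notify(self):
        self.dirty = True

    def poll(self, now=None):
        now = time.monotonic() if now is None else now
        if self.dirty and now - self.last_pull >= self.min_interval:
            self.last_pull = now
            self.dirty = False
            return True
        return False
```

With `min_interval` set to 0.5 or 0.8 seconds, the client pulls on the first notification, then ignores further ones until the interval has elapsed, which bounds CPU use on both sides.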

By the way, technically, is it the client that does a data pull, or the server that does a broadcast/notification push to the clients?
« Last Edit: September 21, 2021, 09:29:17 am by devEric69 »
use: Linux 64 bits (Ubuntu 20.04 LTS).
Lazarus version: 2.0.4 (svn revision: 62502M) compiled with fpc 3.0.4 - fpDebug \ Dwarf3.

kveroneau

  • Full Member
  • ***
  • Posts: 119
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #6 on: September 21, 2021, 06:23:19 pm »
I would see a parallel with some financial charting servers (the tools that draw things that move all the time). These, basically, retrieve (H,L,O,C) data from a subscribed exchange, sometimes with 2 or 3 other fields at most. So it is originally very structured records that arrive in the memory of a graphics server.

Afterwards, typically, these graphical servers allow calculations to be made (well, the ones I've studied a bit obviously have a C API, with a header file). This is where I would see an application for your server: being able to distribute that raw data, or even better, calculations derived from it, from a graphical-server machine (which could then be called a front-end server, from the point of view of your server) towards other client machines on the network (hoping to be understood).

Thank you for your interest.  I can see this server being used for the purpose of syncing chart data between clients as a possible use-case.

Nevertheless, I see at least these constraints, so that the parallel remains true and does not become false:
- as this data can be very volatile (i.e. large volumes of data), it must be possible to manage some sort of maximum size for a FIFO queue of records on the server side, and even to manage this FIFO's maximum refresh rate, in order not to see the server sink (plateaus of 100% CPU) like other types of servers.
- the fetched data, on the client side, must be able to be updated as often as we'd like; basically, it must be possible to set its refresh rate (a timer?) to 800, perhaps 500 ms. In short, to be able to run tests and stop before large plateaus of 100% CPU consumption appear on this side too.

By the way, technically, is it the client that does a data pull, or the server that does a broadcast/notification push to the clients?

Originally I was going to have the server push the updated data to each of the subscribed clients, but instead I went with a simple notification push of a 32-byte packet.  The idea here was to let the client choose how to react to the event before deciding whether to pull the latest update.  For embedded devices, or slow network connections, this is much more efficient than sending a huge, potentially multi-kilobyte payload the client might just discard.  However, with that said, there is no reason why I couldn't allow the client to choose how the payload is delivered via additional options sent during the subscription phase.  The good part about this project right now is that it can be extremely fluid, as it isn't being used anywhere yet, so I don't have to worry about breaking existing clients or backwards compatibility during initial development.  And to be honest, I'd rather get mostly everything right the first time, before making a formal release where people might actually be using it.
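A fixed 32-byte notification packet like the one described could be laid out along these lines. The fields and their order are guesses for illustration, not the project's actual wire format:

```python
import struct

# Hypothetical 32-byte change notification: a command byte, a 32-bit
# page number, and zero padding up to the fixed size.  '<' pins
# little-endian and disables alignment: 1 + 4 + 27 = 32 bytes.
NOTIFY = struct.Struct('<BI27x')

def pack_notify(command, page_num):
    return NOTIFY.pack(command, page_num)

def unpack_notify(raw):
    return NOTIFY.unpack(raw)   # -> (command, page_num)
```

A client receiving such a packet has everything it needs to decide whether to pull the page, without the server ever shipping the page contents unrequested.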

I do plan on releasing a Docker image with the server in the near future, alongside the required Object Pascal unit file to interface with the server, and a desktop application which will allow remote management of the server.  So look forward to that if you plan on testing it with your own concepts and ideas.  I honestly want to build this server in the best way that I can, so that it can be used in all sorts of different applications, and developer input is critical to accomplishing that goal.

damieiro

  • Full Member
  • ***
  • Posts: 200
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #7 on: September 22, 2021, 02:43:16 pm »
I'll bite. If you propose to serve an image of raw memory, how do you intend to describe what's in it, particularly if you have a heterogeneous population of big- and little-endian systems?
Thank you for bringing that up, actually; I don't know why, but I completely forgot about the big- and little-endianness of processors.  This is a very valid issue which will make sharing raw memory structures among incompatible systems more difficult.  I'm glad you pointed this out now, or during my testing this would have caused some interesting data-consistency issues.  I guess the initial version, until a solid solution is developed, will require both the server and the client to have the same processor endianness.

I like the idea.
About endianness: it depends, but the entire Intel/AMD64 architecture is little-endian. In the worst case there would be two kinds of pages for most people, and the most-used one (little-endian) could be pre-made. You have big-endian on some Unixes, but not on desktops/laptops. Mobile devices accept both little-endian and big-endian.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6646
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #8 on: September 22, 2021, 03:20:54 pm »
About endianness: it depends, but the entire Intel/AMD64 architecture is little-endian. In the worst case there would be two kinds of pages for most people, and the most-used one (little-endian) could be pre-made. You have big-endian on some Unixes, but not on desktops/laptops. Mobile devices accept both little-endian and big-endian.

I've run plenty of big-endian desktop systems, including SPARC and MIPS. I've also run various word sizes, although nothing smaller than 32 bits using FPC or Linux/Solaris.

And I've seen plenty of comms protocols which change their payload format without its being identified by version or magic numbers.

I'm deeply troubled by the fragility of anything that publishes a block of memory without descriptive metadata.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

damieiro

  • Full Member
  • ***
  • Posts: 200
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #9 on: September 22, 2021, 03:49:12 pm »
About endianness: it depends, but the entire Intel/AMD64 architecture is little-endian. In the worst case there would be two kinds of pages for most people, and the most-used one (little-endian) could be pre-made. You have big-endian on some Unixes, but not on desktops/laptops. Mobile devices accept both little-endian and big-endian.

I've run plenty of big-endian desktop systems, including SPARC and MIPS. I've also run various word sizes, although nothing smaller than 32 bits using FPC or Linux/Solaris.

And I've seen plenty of comms protocols which change their payload format without its being identified by version or magic numbers.

I'm deeply troubled by the fragility of anything that publishes a block of memory without descriptive metadata.

MarkMLl

Yes, you're right, and I agree fully with "I'm deeply troubled by the fragility of anything that publishes a block of memory without descriptive metadata."
And yes, comms protocols use that, but a comms protocol doesn't seem to be what's in use here; the use case (at least as I read it) is a server that groups similar machines, not big Unix ones, nor their SPARC terminals. And in the worst case, if you have 1000 requests, 950 little-endian*, 49 big-endian, and 1 other (yes, there are other formats), you could serve two versions of the RAM page and that would be OK too.
*: mobile devices support both, and metadata is needed, as you say;
a lot of GPUs use little-endian.
Really, it is the big-metal hardware that uses big-endian, and comms. Although there is no way to kill off all big-endian (and it's arguable whether it should be killed), I think it's not an issue that makes the idea fail. I think it could be done fairly well with proper management.

kveroneau

  • Full Member
  • ***
  • Posts: 119
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #10 on: September 22, 2021, 05:48:13 pm »
I'm deeply troubled by the fragility of anything that publishes a block of memory without descriptive metadata.

There is metadata associated with each block: currently a short title string, an application identifier, and a type identifier.  These identifiers are single bytes, which eliminates any endianness issues during a find operation.  The idea behind these three specific metadata fields is based on how PalmOS and old-school Mac OS stored application metadata in the file system, whereas at the time PCs used file extensions.

The Type identifier is application specific, of course; it can be searched for using the API call FindType(), and I currently use it during testing to separate the different record types being stored on the server.

Also, unlike USB or other official specifications, the Application identifier is purely in the hands of the developer, as there will most likely be no public servers on the Internet; two distinct bytes give developers enough combinations of AppIds and TypeIds.  There is nothing stopping developers from sharing apps or record types between them, though; if an app or record is shared, the documentation should note which AppIds and/or TypeIds are in use, to avoid clashes during a FindApp/FindType operation.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6646
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #11 on: September 22, 2021, 07:38:41 pm »
There is metadata associated with each block: currently a short title string, an application identifier, and a type identifier.  These identifiers are single bytes, which eliminates any endianness issues during a find

If you use magic numbers of at least 16 bits, then you'll get a cheap indication of unexpected endianness.
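The cheap check described here: a 16-bit magic value read back as its byte-swapped counterpart reveals a peer of the opposite endianness. A sketch, with an arbitrary magic value:

```python
import struct

MAGIC = 0x1234
SWAPPED = 0x3412   # MAGIC with its two bytes exchanged

def classify_magic(raw):
    # Read the 16-bit magic field in native byte order.  A same-endian
    # peer yields MAGIC; an opposite-endian peer yields SWAPPED;
    # anything else means the stream is not what we expected.
    value, = struct.unpack('=H', raw[:2])
    if value == MAGIC:
        return 'same-endian'
    if value == SWAPPED:
        return 'opposite-endian'
    return 'corrupt'
```

Note that 0x1234 works because its two bytes differ; a magic like 0x4141 would be blind to swapping, so the bytes of the chosen value must not be symmetric.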

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

kveroneau

  • Full Member
  • ***
  • Posts: 119
Re: Who'd be interested in network subscribed heap memory? Absurd I know...
« Reply #12 on: October 06, 2021, 02:23:11 am »
Okay, so a beta version of this server has been uploaded to Docker Hub.  I wouldn't recommend grabbing it just yet, as I haven't released the client libraries required to actually talk to the server, so running it would be a bit pointless other than to see some output and get an idea of how it works.

https://hub.docker.com/r/kveroneau/mcserver

I will post a link to a GitHub repo with the client library in the form of a Pascal unit file in complete source form, ready to be used and compiled into programs.  I kept the dependencies minimal: basically all you'll need is FreePascal with the standard ObjPas runtime and the RTL ssockets unit, so it should be able to slip into any program for any target that supports sockets.

This GitHub repo will also have a few examples, both command-line and GUI (made in Lazarus), ready to compile.

A link to the repo will be posted both here in this forum, along with an addition to the Readme on the DockerHub page.

I'd also like to say how surprised I am by the size of the compiled server binary: it currently clocks in at 314 KB!  It can literally fit onto a floppy disk!  This is amazing when you look at the sizes of modern programs, or even game updates.

UPDATE:  Client library is ready, along with an example, more examples will follow soon: https://github.com/kveroneau/mcslib
« Last Edit: October 06, 2021, 04:56:14 am by kveroneau »
