
Let mORMot's applications be even more responsive


In mORmot applications, all the client communication is executed by default in the current thread, i.e. the main thread for a typical GUI application.
This may become an issue in some environments, as reported by several users.

Since all communication is performed in blocking mode, if the remote request takes a long time to process (due to a bad/slow network, or a long server-side action), the application may become unresponsive, from the end-user's point of view.
Windows may even complain about a "non responsive application", and propose to kill the process, which is far from the expected behavior.

In order to interact properly with the user, an OnIdle property has been defined in TSQLRestClientURI, which changes the way communication is handled.
If this callback event is defined, all client communication will be processed in a background thread, and the current thread (probably the main UI thread) will wait for the request to complete in the background, running the OnIdle callback in a loop in the meanwhile.

You can find in the mORMotUILogin unit two methods matching this callback signature:

  TLoginForm = class(TForm)
  (...)
    class procedure OnIdleProcess(Sender: TSQLRestClientURI; ElapsedMS: Integer);
    class procedure OnIdleProcessForm(Sender: TSQLRestClientURI; ElapsedMS: Integer);
  end;

The first OnIdleProcess() callback will change the mouse cursor shape to crHourGlass after a defined period of time.
The OnIdleProcessForm() callback will display a pop-up window with a 'Please wait...' message, if the request takes even more time. Both will call Application.ProcessMessages to ensure the application User Interface is still responsive.

Some global variables were also defined to tune the behavior of those two callbacks:

var
  /// define when TLoginForm.OnIdleProcess() has to display the crHourGlass cursor
  // after a given time elapsed, in milliseconds
  // - default is 100 ms
  OnIdleProcessCursorChangeTimeout: integer = 100;

  /// define when TLoginForm.OnIdleProcessForm() has to display the temporary
  // form after a given time elapsed, in milliseconds
  // - default is 2000 ms, i.e. 2 seconds
  OnIdleProcessTemporaryFormTimeout: integer = 2000;

  /// define the message text displayed by TLoginForm.OnIdleProcessForm()
  // - default is sOnIdleProcessFormMessage resourcestring, i.e. 'Please wait...'
  OnIdleProcessTemporaryFormMessage: string;

You can therefore change those settings to customize the user experience. We tested it with a 3-second artificial delay for each request, and the applications were running smoothly, even if slowly - but comparable to most web applications, in fact. The SynFile main demo (available in the SQlite3MainDemo folder) defines such a callback.

Note that this OnIdle feature is defined at TSQLRestClientURI class level, so it is available for all communication protocols (not only HTTP, but also named pipes or in-process), and can be used to enhance the user experience of any time-consuming process.
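
For instance, enabling this behavior may be as simple as assigning one of the supplied callbacks to the client instance - a minimal sketch, assuming an existing TSQLModel instance named Model:

  Client := TSQLHttpClient.Create('localhost','888',Model);
  // process all requests in a background thread, displaying a 'Please wait...'
  // pop-up when a request takes longer than OnIdleProcessTemporaryFormTimeout
  Client.OnIdle := TLoginForm.OnIdleProcessForm;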

Feedback is welcome on our forum, as usual.


Tempering Garbage Collection


I'm currently fighting against out-of-memory errors on a heavily loaded Java server.

If only it had been implemented in Delphi and mORMot!
But at this time, the mORMot was still in its burrow. :)
Copy-On-Write and a good heap manager can do wonders of stability.

Here are some thoughts about Garbage Collectors, and how to temper their limitations.
They may apply to both the JVM and the .Net runtime, by the way.

Some general patterns about Garbage Collection (GC):

  • It is almost impossible to know how much memory is used by a given memory structure at runtime, since the corresponding objects may not have been marked as collectable yet, so are still in memory even when they are not referenced any more;
  • Direct use of object references could be handled by some simple reference-counting mechanism, until you define circular references. Sadly, most GC algorithms are much more complex than plain reference counting: since a GC favors allocation speed, its tendency is to allocate as many objects as possible, re-using and collecting the objects as late as possible;
  • You can force the GC to collect the memory, but it is usually a blocking process (so it may be a bad idea in a real-time service);
  • And since the GC does not have deterministic behavior, you cannot be sure which threshold value of heap use may be a good trigger for garbage collection;
  • Some authors state that most GC algorithms expect 3 to 5 times the used memory to be available (i.e. if you expect 200 MB of data, you need 800 MB of free RAM for your process) - this is mostly due to performance optimization;
  • On the other hand, giving too much memory may have the opposite effect, i.e. reduce the global performance, depending on how the VM works;
  • From my experiments, the .Net memory model seems to be more aggressive than the Java one, especially in multi-threaded processes.

Some usual fixes/optimizations paths:

  • Re-use existing objects instead of creating new instances (using object pools - see the sketch after this list);
  • Use arrays of pre-allocated objects, and restrict their use to POJOs/POCOs;
  • Some memory structures may use less memory and overhead (e.g. an array of structs in C# is much faster and uses much less memory than a list of objects);
  • Limit object cloning/marshaling/wrapping as much as possible, and pass the data as reference;
  • Pre-allocate and re-use memory, e.g. for storing text (a typical efficient pattern is the string builder);
  • Multi-threaded processing (object locking and monitoring) consumes a lot of resources, so instead of locking at object level, mutexes on small parts of the code are much more efficient;
  • Do not create more threads than the number of CPU cores the process runs on - in general, one optimized thread is more efficient than multiple threads: processing should happen in one non-blocking thread, while other threads are used to pre-process or post-process the data, e.g. when something slow may take place like serialization or network access;
  • Profile the execution, then identify the real bottlenecks to be optimized - for instance, working with individual small files is an awful practice;
  • Use a fast unmanaged in-process storage (e.g. SQLite3, BerkeleyDB, memcached...) instead of storing long-term objects in GC memory.
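
To illustrate the first pattern, here is a minimal object pool sketch - expressed in Object Pascal for consistency with the rest of this blog, with TWorkItem being a hypothetical re-usable class; the very same idea applies to Java or C# pools, and this naive version is not thread-safe (add a lock for concurrent use):

  uses Contnrs;

  type
    TWorkItem = class // hypothetical re-usable work instance
    public
      procedure Clear; virtual; // override to reset all fields before re-use
    end;

    TWorkItemPool = class
    private
      fPool: TObjectList;
    public
      constructor Create;
      destructor Destroy; override;
      function Acquire: TWorkItem;
      procedure Release(aItem: TWorkItem);
    end;

  procedure TWorkItem.Clear;
  begin
  end;

  constructor TWorkItemPool.Create;
  begin
    inherited;
    fPool := TObjectList.Create(true); // owns (and frees) pooled instances
  end;

  destructor TWorkItemPool.Destroy;
  begin
    fPool.Free;
    inherited;
  end;

  function TWorkItemPool.Acquire: TWorkItem;
  begin
    if fPool.Count>0 then
      // re-use an existing instance instead of allocating a new one
      Result := TWorkItem(fPool.Extract(fPool[fPool.Count-1])) else
      Result := TWorkItem.Create;
  end;

  procedure TWorkItemPool.Release(aItem: TWorkItem);
  begin
    aItem.Clear;      // reset state
    fPool.Add(aItem); // make it available for the next Acquire
  end;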

For server process, or mobile execution, unmanaged environments like Delphi are still a perfect fit!

Summer videos of mORMot


During this summer, warleyalex did meet some mORMots in the mountains of REST, Java, AJAX and JSON.

(picture may differ from actual user :) )

He made some videos of his experiments with our little rodent.
At this time, there are 11 videos available!

The latest one is a Java client application, communicating with a mORMot server and a SQLite3 database.

Please go to this forum post to find all URIs, and put your feedback.

Thanks a lot for sharing!

HTTPS communication in mORMot


In mORMot, the http.sys kernel-mode server can be defined to serve HTTPS secure content.

Yes, mORMots do like sophistication:

When the aUseSSL boolean parameter is set for TSQLHttpServer.Create() constructor, the SSL layer will be enabled within http.sys.
Note that useHttpSocket kind of server does not offer SSL encryption yet.
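
For instance, creating the server with SSL enabled may look like this - a hedged sketch, since the exact parameter list of TSQLHttpServer.Create may vary between framework revisions (check the declaration in your own version), and aRestServer is assumed to be an existing TSQLRestServer instance:

  aHTTPServer := TSQLHttpServer.Create(
    '8012',[aRestServer],'+',useHttpApi,32,{aUseSSL=}true);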

We will now define the steps needed to set up an HTTPS server in mORMot.

In order to let the SSL layer work as expected, you need first to create and import a set of certificates.
Here are the needed steps, as detailed in http://www.codeproject.com/Articles/24027/SSL-with-Self-hosted-WCF-Service and http://msdn.microsoft.com/en-us/library/ms733791.aspx - you can refer to any WCF related documentation about HTTPS, since it shares the http.sys kernel-mode server with mORMot and IIS.

Certificates

You need one certificate (cert) to act as your root authority, and one to act as the actual certificate to be used for the SSL, which needs to be signed by your root authority. If you don't set up the root authority your single certificate won't be trusted, and you will start to discover this through a series of extremely annoying exceptions, long after the fact.

The following command (run in a Visual Studio command prompt) will create your root certificate:

makecert -sv SignRoot.pvk -cy authority -r signroot.cer -a sha1
  -n "CN=Dev Certification Authority" -ss my -sr localmachine

Take a look at the above links to see what each of these arguments means - it isn't terribly important, but it's nice to know.

The MakeCert tool is available as part of the Windows SDK, which you can download from http://go.microsoft.com/fwlink/p/?linkid=84091 if you do not want to download the whole Visual Studio package. Membership in Administrators, or equivalent, on the local computer is the minimum required to complete this procedure.

Once this command has been run and succeeded, you need to make this certificate a trusted authority. You do this by using the MMC snap in console. Go to the run window and type "mmc", hit enter. Then in the window that opens (called the "Microsoft Management Console", for those who care) perform the following actions:

File -> Add/Remove Snap-in -> Add… -> Double click Certificates -> Select Computer Account and Click Next -> Finish -> Close -> OK

Then select the Certificates (Local Computer) -> Personal -> Certificates node.

You should see a certificate called "Dev Certificate Authority" (or whatever else you decided to call it as parameter in the above command line). Move this certificate from the current node to Certificates (Local Computer) -> Trusted Root Certification Authorities -> Certificates node, drag and drop works happily.

Note that you do NOT yet have the cert you need :)
You have made yourself able to create trusted certs though, which is nice.
Now you have to create another cert, which you are actually going to use.

Run makecert again, but run it as follows...

makecert -iv SignRoot.pvk -ic signroot.cer -cy end -pe
  -n "CN=localhost" -eku 1.3.6.1.5.5.7.3.1 -ss my -sr localmachine
  -sky exchange -sp "Microsoft RSA SChannel Cryptographic Provider" -sy 12

Note that you are using the first certificate as the issuer of this new one. This is important... where I have localhost, you need to put the DNS name of your box. In other words, if you deploy your service such that its endpoint reads http://bob:10010/Service, then the name needs to be bob. In addition, you are going to need to do this for each host you need to run as (yes, so one for bob and another one for localhost).

Get the signature of your cert by double clicking on it (in Certificates (Local Computer) -> Personal -> Certificates), opening the Details tab, and scrolling down to the "Thumbprint" option.

Select the thumbprint and copy it. Put it in Notepad or any other text editor and replace the spaces with nothing. Keep this thumbprint hexadecimal value safe, since we will need it soon.

You have your certs set up. Congrats!
But we are not finished yet.

Configure a Port with an SSL certificate

Now you get to use another fun tool, httpcfg (for XP/2003), or its newer version, named netsh http (for Vista/Seven/Eight).

First, run the command below to check that you don't have anything already registered on the port you want.

httpcfg query ssl

(under XP)

netsh http show sslcert

(under Vista/Seven/Eight)

If this is your first time doing this, it should just return a newline. If SSL is already set up on the exact IP you want to use (or if later on you need to delete any mistakes), you can use the delete commands described below, where the IP and the port are displayed as a result of the previous query.

Now we have to bind an SSL certificate to a port number, as such (here below, 0000000000003ed9cd0c315bbb6dc1c08da5e6 is the thumbprint of the certificate, as you copied it into the notepad in the previous paragraph):

httpcfg set ssl -i 0.0.0.0:8012 -h 0000000000003ed9cd0c315bbb6dc1c08da5e6

(under XP)

netsh http add sslcert ipport=0.0.0.0:8000 certhash=0000000000003ed9cd0c315bbb6dc1c08da5e6 appid={00112233-4455-6677-8899-AABBCCDDEEFF}

(under Vista/Seven/Eight)
Here the appid= parameter is a GUID that can be used to identify the owning application.

To delete an SSL certificate from a port number previously registered, you can use one of the following commands:

httpcfg delete ssl -i 0.0.0.0:8005 -h 0000000000003ed9cd0c315bbb6dc1c08da5e6
httpcfg delete ssl -i 0.0.0.0:8005

(under XP)

Netsh http delete sslcert ipport=0.0.0.0:8005

(under Vista/Seven/Eight)

Note that it is mandatory to first delete an existing certificate for a given port before replacing it with a new one.

Feedback and information is welcome on our forum, as usual.

Thread-safety of mORMot


We tried to make mORMot at the same time fast and safe, and able to scale with the best possible performance on the hardware it runs on.
Multi-threading is the key to better usage of modern multi-core CPUs, and also client responsiveness.

As a result, on the Server side, our framework was designed to be thread-safe.

On typical production use, the mORMot HTTP server will run on its own optimized thread pool, then call the TSQLRestServer.URI method. This method is therefore expected to be thread-safe, e.g. when called from the TSQLHttpServer.Request method. Thanks to the RESTful approach of our framework, this method is the only one which is expected to be thread-safe, since it is the single entry point of the whole server. This KISS design ensures better test coverage.

Let us see now how this works, and publish some benchmarks to check how efficiently it has been implemented.

By design

In order to achieve this thread-safety without sacrificing performance, the following rules were applied in TSQLRestServer.URI:

  •  Most of this method's logic is to process the incoming parameters, so it is thread-safe by design (e.g. Model and RecordProps access do not change during the process); 
  •  The SQLite3 engine access is protected at SQL/JSON cache level, via DB.LockJSON() calls in TSQLRestServerDB methods; 
  • TSQLRestServerStatic main methods (EngineList, EngineRetrieve, EngineAdd, EngineUpdate, EngineDelete, EngineRetrieveBlob, EngineUpdateBlob) are thread-safe: e.g. TSQLRestServerStaticInMemory uses a per-Table Critical Section; 
  • TSQLRestServerCallBack methods (i.e. published methods of the inherited TSQLRestServer class) must be implemented to be thread-safe (see the sketch after this list); 
  • Interface-based services have several execution modes, including thread safe automated options (see TServiceMethodOption), or manual thread safety expectation, for better scaling; 
  •  A protected fSessionCriticalSection is used to protect shared fSession[] access between clients; 
  •  Remote external tables use thread-safe connections and statements when accessing the databases via SQL; 
  •  Access to fStats was not made thread-safe, since this data is indicative only: a mutex was not used to protect this resource.
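
As an illustration of that fourth rule, here is a minimal sketch of a thread-safe method-based service - assuming the current TSQLRestServerURIContext callback signature, and fLock: TRTLCriticalSection / fCounter: integer fields (both hypothetical names) initialized in the server's constructor:

  procedure TMyRestServer.Counter(Ctxt: TSQLRestServerURIContext);
  begin
    EnterCriticalSection(fLock);
    try
      inc(fCounter); // shared state: protected by the critical section
      Ctxt.Results([fCounter]);
    finally
      LeaveCriticalSection(fLock);
    end;
  end;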

We tried to make the internal Critical Sections as short as possible, or relative to a table only (e.g. for TSQLRestServerStaticInMemory).

There is some kind of "giant lock" at the SQLite3 engine level, so all requests will be queued.
This was not found to be a major issue (see the benchmark results below), since the internal SQL/JSON cache implementation needs such a global lock, and since most SQLite3 resource use consists of hard disk access, which benefits from being queued.
It also allows using SQLite3 in lmExclusive locking mode if needed, with both benefits of high performance and multi-thread friendliness.

From the Client-side, the REST core of the framework is expected to be Client-safe by design, therefore perfectly thread-safe: it's the benefit of the stateless architecture.

By proof

When we are talking about thread-safety, nothing compares to a dedicated stress test program.
An average human brain (like ours) is not good enough to ensure proper design of such a complex process.
So we have to prove the abilities of our little mORMot.

In the supplied regression tests, we designed a whole class of multi-thread testing, named TTestMultiThreadProcess.
Its methods will run each and every Client-Server protocol available (direct access via TSQLRestServerDB or TSQLRestClientDB, GDI messages, named pipes, and both HTTP servers - i.e. http.sys based or WinSock-based).

Each protocol will execute in parallel a list of INSERTs - i.e. TSQLRest.Add() - followed by a list of SELECTs - i.e. TSQLRest.Retrieve().
Those requests will be performed in 1 thread, then 2, 5, 10, 30 and 50 concurrent threads.
The very same SQLite3 database (in lmExclusive locking mode) is accessed at once by all those clients.
Then the IDs generated by each thread are compared together, to ensure no cross-insertion did occur during the process.

Those automated tests did already reveal some issues in the initial implementation of the framework. We fixed any encountered problems, as soon as possible.
Feel free to send us any feedback, with code to reproduce the issue: but do not forget that multi-threading is also difficult to test - problems may occur not in the framework, but in the testing code itself!

When setting OperationCount to 1000 instead of the default 200, i.e. running 1000 INSERTions and 1000 SELECTs in concurrent threads, the numbers are the following, on the local machine (compiled with Delphi XE4):

 Multi thread process:
- Create thread pool: 1 assertion passed  3.11ms
- TSQLRestServerDB: 24,061 assertions passed  903.31ms
1=41986/s  2=24466/s  5=14041/s  10=9212/s  30=10376/s  50=10028/s
- TSQLRestClientDB: 24,062 assertions passed  374.93ms
1=38606/s  2=35823/s  5=30083/s  10=32739/s  30=33454/s  50=30905/s
- TSQLRestClientURINamedPipe: 12,012 assertions passed  1.68s
1=4562/s  2=5002/s  5=3177/s
- TSQLRestClientURIMessage: 16,022 assertions passed  616.00ms
1=16129/s  2=24873/s  5=8613/s  10=11857/s
- TSQLHttpClientWinHTTP_HTTPAPI: 24,056 assertions passed  1.63s
1=5352/s  2=7441/s  5=7563/s  10=7903/s  30=8413/s  50=9106/s
- TSQLHttpClientWinSock_WinSock: 24,061 assertions passed  1.10s
1=11528/s  2=10941/s  5=12014/s  10=12039/s  30=9443/s  50=10831/s
Total failed: 0 / 124,275  - Multi thread process PASSED  6.31s

For direct in-process access, TSQLRestClientDB sounds like the best candidate: its abstraction layer is very thin, and much more multi-thread friendly than straight TSQLRestServerDB calls.
It will also feature a cache, if needed.
And it will allow your code to switch between TSQLRestClientURI kinds of classes, thanks to its shared abstract methods.
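
For instance, the very same CRUD code may target either transport - a hedged sketch, assuming Model, Rec, aID and UseHttp are defined elsewhere:

  var Client: TSQLRestClientURI;
  ...
  if UseHttp then
    Client := TSQLHttpClient.Create('localhost','888',Model) else
    Client := TSQLRestClientDB.Create(Model,nil,'test.db3',TSQLRestServerDB);
  try
    aID := Client.Add(Rec,true); // shared abstract CRUD methods
    Client.Retrieve(aID,Rec);
  finally
    Client.Free;
  end;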

Named pipes and GDI messages are a bit constrained in highly parallel mode, but HTTP does pretty good.
The server based on http.sys (HTTP API) is even impressive: the more clients, the more responsive it is.
It is known to scale much better than the supplied WinSock-based class, which shines with one unique local client (i.e. in the context of those in-process regression tests), but sounds less reliable in production.

Check yourself before you wreck yourself

In addition, you can get your own idea of the performance, and run the "21 - HTTP Client-Server performance" sample programs, locally or over a network, to check mORMot's ability to scale and serve a lot of clients with as few resources as possible.

Compile both client and server projects, then launch Project21HttpServer.exe.
The server side will execute as a console window.

This Server will define the same TSQLRecordPeople as used during our multi-thread regression tests, that is:

type
  TSQLRecordPeople = class(TSQLRecord)
  private
    fFirstName: RawUTF8;
    fLastName: RawUTF8;
    fYearOfBirth: integer;
    fYearOfDeath: word;
  published
    property FirstName: RawUTF8 read fFirstName write fFirstName;
    property LastName: RawUTF8 read fLastName write fLastName;
    property YearOfBirth: integer read fYearOfBirth write fYearOfBirth;
    property YearOfDeath: word read fYearOfDeath write fYearOfDeath;
  end;

The server main block is just the following:

  aModel := TSQLModel.Create([TSQLRecordPeople]);
  try
    aDatabaseFile := ChangeFileExt(paramstr(0),'.db3');
    DeleteFile(aDatabaseFile);
    aServer := TSQLRestServerDB.Create(aModel,aDatabaseFile);
    try
      aServer.DB.Synchronous := smOff;
      aServer.DB.LockingMode := lmExclusive;
      aServer.NoAJAXJSON := true;
      aServer.CreateMissingTables;
      // launch the server
      aHTTPServer := TSQLHttpServer.Create('888',[aServer]);
      try
        writeln(#13#10'Background server is running at http://localhost:888'#13#10+
                #13#10'Press [Enter] to close the server.');
        ConsoleWaitForEnterKey;
        with TSQLLog.Family do
          if not (sllInfo in Level) then
            Level := Level+[sllInfo]; // let global server stats be logged
      finally
        aHTTPServer.Free;
      end;
    finally
      aServer.Free;
    end;
  finally
    aModel.Free;
  end;

It will give CRUD access to the TSQLRecordPeople table, from HTTP.
We defined Synchronous := smOff and LockingMode := lmExclusive to have the best performance possible.
Our purpose here is not to have true ACID behavior, but test concurrent remote access.

The Client is just a RAD form which will execute the very same code as during the regression tests, i.e. a TTestMultiThreadProcess class instance, as shown by the following code:

    Tests := TSynTestsLogged.Create;
    Test := TTestMultiThreadProcess.Create(Tests);
    try
      Test.ClientOnlyServerIP := StringToAnsi7(lbledtServerAddress.Text);
      Test.MinThreads := ThreadCount;
      Test.MaxThreads := ThreadCount;
      Test.OperationCount := OperationCount;
      Test.ClientPerThread := ClientPerThread;
      Test.CreateThreadPool;
      txt := Format
        ('%s'#13#10#13#10'Test started with %d threads, %d client(s) per thread and %d rows to be inserted...',
        [txt,ThreadCount,ClientPerThread,OperationCount]);
      mmoInfo.Text := txt;
      Timer.Start;
      Test._TSQLHttpClientWinHTTP_HTTPAPI;
      txt := mmoInfo.Text+Format(#13#10'Assertion(s) failed: %d / %d'+
        #13#10'Number of clients connected at once: %d'+
        #13#10'Time to process: %s'#13#10'Operation per second: %d',
        [Test.AssertionsFailed,Test.Assertions,
         ThreadCount*ClientPerThread,Timer.Stop,Timer.PerSec(OperationCount*2)]);
      mmoInfo.Text := txt;
    finally
      Test.Free;
      Tests.Free;
    end;

Each thread of the thread pool will create its own HTTP connection, then loop to insert (Add ORM method) and retrieve (Retrieve ORM method) a fixed number of objects - checking that the retrieved object fields match the inserted values. Then all generated IDs of all threads are checked for consistency, to ensure no race condition did occur.

The input parameters are therefore the following:

  •  Remote HTTP server IP (port is 888); 
  •  Number of client threads; 
  •  Number of client instances per thread; 
  •  Number of TSQLRecordPeople objects added.

When running over the following hardware configuration:

  •  Server is a Core i7 Notebook, with SSD, under Windows 7; 
  •  Client is a Core 2 Duo Workstation, with regular hard-drive (not used), under Windows 7; 
  •  Communicating over a somewhat slow 100 Mb network with a low priced Ethernet HUB.

Typical results are the following:

 Threads  Clients/thread  Rows inserted  Total clients  Time (sec)  Op/sec
       1               1          10000              1       15.78    1267
      50               1          10000             50        2.96    6737
     100               1          10000            100        3.09    6462
     100               1          20000            100        6.19    6459
      50               2         100000            100       34.99    5714
     100               2         100000            200       36.56    5469
     500             100         100000          50000       92.92    2152

During all tests, no assertion failed, meaning that no concurrency problem occurred, nor was any remote command lost.
It is worth noting that when run several times in a row, the same set of input parameters give the very same speed results: it indicates that the architecture is pretty stable and could be considered as safe.
The system is even able to serve 50,000 connected clients at once, with no data loss - in this case, performance is lower (2152 inserts/second in the above table), but we clearly reached the CPU and network limit of our client hardware configuration; meanwhile, server resources on the notebook still had some potential.

Average performance is pretty good, even more if we consider that we are inserting one object per request, with no transaction.
In fact, it sounds as if our little SQLite3 server is faster than most database servers, even when accessed in highly concurrent mode! In batch mode, we may achieve amazing results.

Feel free to send your own benchmark results and feedback, e.g. with concurrent clients on several workstations, or long-running tests, on our forums.

FreePascal Lazarus and Android Native Controls


We all know that the first Delphi for Android was just released...

I just found out an amazing alternative, using native Android controls, and FPC/Lazarus as compiler and IDE.

It creates a small .apk file: only 180 KB, from my tests!

It makes use of direct LCL access of Android native controls, so it is a great sample.

What I like very much is the following:

  • It uses native UI controls, so you do not suffer from FireMonkey restrictions about Unicode/RTL languages and such, and have a native look and feel;
  • It consists of a set of Java classes, used as glue to the Android platform, accessed via JNI from a FPC-compiled library;
  • It creates very small application files (great for downloading) - the FPC compiler generates a .so binary library which compresses to a bit more than 100 KB;
  • There are some nice low-level units in JNI for bridging the object pascal code to Java;
  • Most of the logic is written in Java itself, so you do not have to fight with event or low-level Java structure translations in object pascal code;
  • Framework code is still readable and does not suffer from multi-platform targeting;
  • You can re-use your own existing object pascal code, with no problem of restrictions/regressions due to the Delphi NextGen compiler;
  • As a result, you are encouraged to separate your business logic from your UI code - which is a good idea;
  • It sounds stable in practice, even at this early stage - its "glued" design sounds easier to debug than the huge FireMonkey, if you are able to read some limited Java code;
  • You have a sibling Native iOS Controls project using FPC available to target also iPhone/iPad devices;
  • It is Open and expandable, so you are able to fork the project if needed.
Drawbacks are:
  • It is more like a proof-of-concept than a whole framework, in its current stage;
  • It is not well known nor supported by a big company (worth a new tag on Stack Overflow?);
  • It is free, so you won't give away your money to Embarcadero.
It is definitively one step forward in pushing us in the direction of full FPC support for mORMot!

Good old object is not to be deprecated - it is the future


Yes, I know this article title is a huge moment of trolling for most Delphi developers.
But object could be legend... - wait for it - ... dary!

You may have already noticed from several blog posts here that I still like the good old (and deprecated) object type, in addition to the common heap-allocated class type.
A plain record with methods does not match the object-oriented approach of object, since it does not feature inheritance.

When you take a look at modern strongly-typed languages, targeting concurrent programming (you know, multi-thread/multi-core execution), you will see that the objects may be allocated in several ways, to facilitate execution flow.

The Rust language for instance is pretty interesting. It has optional task-local Garbage Collection and safe pointer types with region analysis.

To some extent, it is very similar to what object allows in the Delphi world, and why I'm still using/loving it!

Memory models

You have indeed several memory models around:

  • Manual memory handling
    C and C++ provide very fine-grained control over memory allocation; heap memory can be explicitly allocated and de-allocated.
  • Full garbage collection
    The majority of modern languages expose a memory model that uses the heap for everything, ranging from Java to Go to JavaScript to Python to Ruby to Haskell.
  • Garbage collection with value types and references
    C# for instance is essentially garbage-collected, but features value types which are guaranteed to be stack-allocated if in local variables.
  • Manual memory handling with reference-counted types
    This is where Delphi shines with its Copy-On-Write (COW) types and reference-counted interface, and also Objective-C with its ARC model (which is a more sophisticated approach to reference counting).
  • Safe manual memory management
    Rust falls into this category: you can choose where your object will be allocated (heap or stack), and how tasks work with it (it minimizes sharing by default, in contrast to other models).
As you can see, Rust is pretty unique in this panel.
But we will see that Delphi is not far away from it.

Rust memory model

Rust allocates objects in a task-oriented manner (extracted from their language definition):

Rust has a memory model centered around concurrently-executing tasks.
Thus its memory model and its concurrency model are best discussed simultaneously, as parts of each only make sense when considered from the perspective of the other.

When reading about the memory model, keep in mind that it is partitioned in order to support tasks; and when reading about tasks, keep in mind that their isolation and communication mechanisms are only possible due to the ownership and lifetime semantics of the memory model.

So the memory model is the following:

A Rust program's memory consists of a static set of items, a set of tasks each with its own stack, and a heap.
Immutable portions of the heap may be shared between tasks, mutable portions may not.

The immutability of memory portions does make sense, but I tend to like the Delphi COW approach very much.
For instance, local variables in Rust are immutable unless declared with let mut.

Rust features a task-oriented memory model, whose purpose is to avoid a full Garbage-Collection memory model.
The basic idea is to remove garbage collection from the core language and relegate it to the standard library, with a minimal set of language hooks in place to allow for flexible, pluggable automatic memory management.

As a consequence, you can allocate your objects from the stack or the heap - this is a difficult concept to grasp for many programmers used to languages like Java or C# that don’t make such a distinction.

In Rust, a box is a reference to a heap allocation holding another value.

There are two kinds of boxes: managed boxes and owned boxes.

  • A managed box type or value is constructed by the prefix @
  • An owned box type or value is constructed by the prefix ~.

In short:

  • ~ is just the Rust equivalent of the C malloc and free functions;
  • @ is a replacement for manual reference counting in C programs (i.e. when your value need to be managed outside the immediate execution scope);
  • & will define a "borrowed pointer", i.e. a reference to the object.
As a consequence, Rust is able to use the stack or a task-specific heap to handle owned box values.
It results in a much better performance scalability, when used in concurrent mode.
For a more complete introduction to Rust's memory model, see this great article.

On Delphi side

Not object instead of class - but object in conjunction with class/interface for most high-level objects.

There are a lot of opportunities where it does make sense to have: 

  • Individual objects allocated locally on stack within a method/function scope;
  • Set of objects allocated at once on a [dynamic] array (local or shared);
  • Map some binary content with strongly-typed pointers and object methods.
You can do this with a plain record, but object allows inheritance, which is IMHO mandatory to follow proper OOP design - the Single Responsibility principle, for instance.
In Domain-Driven Design, objects are great to define value objects.
With strong-typed pointers, your code can still be safe to work with (it won't suffer from C weakness of pointer typing).
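
As a reminder of what this good old type permits, here is a short sketch (legacy object syntax, which recent Delphi compilers may handle imperfectly - see the remark at the end of this article):

  type
    TAnimal = object       // a value type: allocated wherever it is declared
      Name: string[15];
      function Description: string;
    end;
    TDog = object(TAnimal) // inheritance - not possible with a plain record
      Barks: boolean;
    end;

  function TAnimal.Description: string;
  begin
    Result := Name;
  end;

  procedure UseOnStack;
  var Dog: TDog; // stack-allocated: no Create/Free, no heap, no GC involved
  begin
    Dog.Name := 'Rex';
    Dog.Barks := true;
  end;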

In my opinion, the Delphi language, in its current state, features a large panel of memory models.
Reducing the Delphi language to just one string type or just one memory model (like with the Delphi NextGen compiler) is not a good idea.
It sounds like a lack of vision to me. 

Some years ago, we were told by EMB that Garbage Collection was the future.
Now - thanks to the Apple hype - the ARC model is the new model to follow.
When you take a wide look, e.g. when you look at Rust, you can see that a less monolithic approach (like the one existing on OldGen Delphi) could make sense!

From my understanding, the object pascal memory model (with dedicated class and object types) may be easier to work with than the Rust @ ~ obfuscated syntax.
And you are not stuck by hard coded principles like default immutability of tasks: in Delphi, you can still use a global heap, and rely on dedicated structures when you want better performance.
This is what we tried to do with mORMot: high-level methods are easy to work with, but its internal core uses low-level tricks (like pointer arithmetic or stack-allocated structures) to ensure the best possible scalability.

From a practical point of view, Object Pascal's manual memory handling with reference-counted types still makes sense.
In addition to the initial COW paradigm, the RCU (Read-Copy-Update) pattern could make sense: when used in a multi-thread context, it allows lock-free shared access to resources (which is not allowed by Rust).

The proposal of a threadlocalvar compiler enhancement - in this very same blog, back in 2010 - still does make sense, and perfectly fit with the memory model proposed by Rust.
A custom mode of class allocation could make sense also, by overriding the low-level allocation/release model of TObject to allocate on the heap by default, but on the stack (with auto-release at the end of scope) with an optional pattern.

It is a pity that the object type is just broken since Delphi 2010.
Feel free to give your feedback on our forum, as usual.

DataSnap-like Client-Server JSON RESTful Services in Delphi 6-XE5


Article update:
The server side call back signature changed since this article was first published in 2010. 
Please refer to the documentation or this forum article and associated commit.
The article was totally rewritten to reflect the enhancements.
And do not forget to see mORMot's interface-based services!

Note that the main difference with previous implementation is the signature of the service implementation event, which should be now exactly:
procedure MyService(Ctxt: TSQLRestServerURIContext);
(note that there is one unique class parameter, with no var specifier)
Please update your code if you are using method-based services!


You certainly know about the new DataSnap Client-Server features, based on JSON, introduced in Delphi 2010.
http://docwiki.embarcadero.com/RADStudi … plications

We added such communication to our mORMot Framework, in a KISS (i.e. simple) way: no expert, no new unit or new class. You just define a published method on the Server side, then use easy functions about JSON or URL-parameters to get the request encoded and decoded as expected, on the Client side.

We'll implement the same example as in the official Embarcadero docwiki page above.
Add two numbers.
Very useful service, isn't it?

Publishing a service on the server

On the server side, we need to customize the standard TSQLRestServer class definition (more precisely a TSQLRestServerDB class which includes a SQLite3 engine, or a lighter TSQLRestServerFullMemory kind of server, which is enough for our purpose), by adding a new published method:

type
  TSQLRestServerTest = class(TSQLRestServerFullMemory)
   (...)
  published
    procedure Sum(Ctxt: TSQLRestServerURIContext);
  end;

The method name ("Sum") will be used for the URI encoding, and will be called remotely from ModelRoot/Sum URL.
The ModelRoot is the one defined in the Root parameter of the model used by the application.

This method, like all Server-side methods, MUST have the same exact parameter definition as in the TSQLRestServerCallBack prototype, i.e. only one Ctxt parameter, which refers to the whole execution context:

type
  TSQLRestServerCallBack = procedure(Ctxt: TSQLRestServerURIContext) of object;

Then we implement this method:

procedure TSQLRestServerTest.Sum(Ctxt: TSQLRestServerURIContext);
begin
  with Ctxt do
    Results([Input['a']+Input['b']]);
end;

The Ctxt variable publishes some properties named InputInt[], InputDouble[], InputUTF8[] and Input[], able to retrieve directly a parameter value from its name, respectively as Integer/Int64, double, RawUTF8 or variant.

Therefore, the code above using Input[] will introduce a conversion via a variant, which may be a bit slower, and in case of string content, may lose some content for older non-Unicode versions of Delphi.
So it is a good idea to use the exact expected Input*[] property corresponding to your value type. It does make sense even more when handling text, i.e. InputUTF8[] is to be used in such case. For our floating-point computation method, we may have coded it as such:

procedure TSQLRestServerTest.Sum(Ctxt: TSQLRestServerURIContext);
begin
  with Ctxt do
    Results([InputDouble['a']+InputDouble['b']]);
end;

The Ctxt.Results([]) method is used to return the service value as one JSON object with one "Result" member, with default MIME-type JSON_CONTENT_TYPE.

For instance, the following request URI:

 GET /root/Sum?a=3.12&b=4.2

will let our server method return the following JSON object:

 {"Result":7.32}

That is, a perfectly AJAX-friendly request.

Note that all parameters are expected to be plain case-insensitive 'A'..'Z','0'..'9' characters.

An important point is to remember that the implementation of the callback method must be thread-safe.
In fact, the TSQLRestServer.URI method expects such callbacks to handle the thread-safety on their side.
It's perhaps some more work to handle a critical section in the implementation, but, in practice, it's the best way to achieve performance and scalability: the resource locking can be made at the tiniest code level.

Defining the client

The client-side is implemented by calling some dedicated methods, and providing the service name ('sum') and its associated parameters:

function Sum(aClient: TSQLRestClientURI; a, b: double): double;
var err: integer;
begin
  val(aClient.CallBackGetResult('sum',['a',a,'b',b]),Result,err);
end;

You could even implement this method in a dedicated client class - which makes sense:

type
  TMyClient = class(TSQLHttpClient) // could be TSQLRestClientURINamedPipe
  (...)
    function Sum(a, b: double): double;
  (...)

function TMyClient.Sum(a, b: double): double;
var err: integer;
begin
  val(CallBackGetResult('sum',['a',a,'b',b]),Result,err);
end;

This latter implementation is to be preferred in real applications.

You have to create the server instance, and the corresponding TSQLRestClientURI (or TMyClient), with the same database model, just as usual...

On the Client side, you can use the CallBackGetResult method to call the service from its name and its expected parameters, or create your own caller using the UrlEncode() function.
Note that you can marshal most class instances into their JSON representation, by using TObject parameters in the method arguments:

function TMyClient.SumMyObject(a, b: TMyObject): double;
var err: integer;
begin
  val(CallBackGetResult('summyobject',['a',a,'b',b]),Result,err);
end;

This Client-Server protocol uses JSON here, as encoded server-side via Ctxt.Results() method, but you can serve any kind of data, binary, HTML, whatever... just by overriding the content type on the server with Ctxt.Returns().

Direct parameter marshalling on server side

We have used above the Ctxt.Input*[] properties to retrieve the input parameters.
This is pretty easy to use and powerful, but the supplied Ctxt gives full access to the input and output context.

Here is how we may implement the fastest possible parameters parsing:

procedure TSQLRestServerTest.Sum(Ctxt: TSQLRestServerURIContext);
var a,b: Extended;
begin
  if UrlDecodeNeedParameters(Ctxt.Parameters,'A,B') then begin
    while Ctxt.Parameters<>nil do begin
      UrlDecodeExtended(Ctxt.Parameters,'A=',a);
      UrlDecodeExtended(Ctxt.Parameters,'B=',b,@Ctxt.Parameters);
    end;
    Ctxt.Results([a+b]);
  end else
    Ctxt.Error('Missing Parameter');
end;

The only non-obvious part of this code is the parameter marshalling, i.e. how the values are retrieved from the incoming Ctxt.Parameters text buffer, then converted into native local variables.

On the Server side, typical implementation steps are therefore:

  •  Use the UrlDecodeNeedParameters function to check that all expected parameters were supplied by the caller in Ctxt.Parameters; 
  •  Call UrlDecodeInteger / UrlDecodeInt64 / UrlDecodeExtended / UrlDecodeValue / UrlDecodeObject functions (all defined in SynCommons.pas) to retrieve each individual parameter from standard JSON content; 
  •  Implement the service (here it is just the a+b expression); 
  •  Then return the result calling Ctxt.Results() method or Ctxt.Error() in case of any error.

The powerful UrlDecodeObject function (defined in mORMot.pas) can be used to un-serialize most class instance from its textual JSON representation (TPersistent, TSQLRecord, TStringList...).

Using Ctxt.Results() will encode the specified values as a JSON object with one "Result" member, with default mime-type JSON_CONTENT_TYPE:

 {"Result":"OneValue"}

or a JSON object containing an array:

 {"Result":["One","two"]}

Returns non-JSON content

Using Ctxt.Returns() will let the method return the content in any format, e.g. as a JSON object (via the overloaded Ctxt.Returns([]) method expecting field name/value pairs), or any content, since the returned MIME type can be defined as a parameter to Ctxt.Returns() - it may be useful to specify another MIME type than the default JSON_CONTENT_TYPE constant, i.e. 'application/json; charset=UTF-8', and return plain text, HTML or binary.

For instance, you can return directly a value as plain text:

procedure TSQLRestServer.TimeStamp(Ctxt: TSQLRestServerURIContext);
begin
  Ctxt.Returns(Int64ToUtf8(ServerTimeStamp),HTML_SUCCESS,TEXT_CONTENT_TYPE_HEADER);
end;

Or you can return some binary file, retrieving the corresponding MIME type from its binary content:

procedure TSQLRestServer.GetFile(Ctxt: TSQLRestServerURIContext);
var fileName: TFileName;
    content: RawByteString;
    contentType: RawUTF8;
begin
  fileName :=  'c:\data\'+ExtractFileName(Ctxt.Input['filename']);
  content := StringFromFile(fileName);
  if content='' then
    Ctxt.Error('',HTML_NOTFOUND) else
    Ctxt.Returns(content,HTML_SUCCESS,HEADER_CONTENT_TYPE+
         GetMimeContentType(pointer(content),Length(content),fileName));
end;

The corresponding client method may be defined as such:

function TMyClient.GetFile(const aFileName: RawUTF8): RawByteString;
begin
  if CallBackGet('GetFile',['filename',aFileName],RawUTF8(result))<>HTML_SUCCESS then
    raise Exception.CreateFmt('Impossible to get file: %s',[result]);
end;
end;

If you use HTTP as communication protocol, you can consume these services, implemented Server-Side in fast Delphi code, with any AJAX application on the client side.

Using GetMimeContentType() when sending non-JSON content (e.g. a picture, a PDF file, binary data...) will be interpreted as expected by any standard Internet browser: it could be used to serve some good old HTML content within a page, not necessarily consuming the service via JavaScript.

Advanced process on server side

On the server side, the method definition has only one Ctxt parameter, which has several members available at calling time, and publishes all service calling features and context, including RESTful URI routing, session handling or low-level HTTP headers (if any).

At first, Ctxt may indicate the expected TSQLRecord ID and TSQLRecord class, as decoded from RESTful URI.
It means that a service can be related to any table/class of our ORM framework, so you would be able to create easily any RESTful compatible requests on URI like ModelRoot/TableName/ID/MethodName.
The ID of the corresponding record is decoded from its RESTful scheme into Ctxt.ID, and the table is available in Ctxt.Table or Ctxt.TableIndex (if you need its index in the associated server Model).

For example, here we return a BLOB field content as hexadecimal, according to its TableName/Id:

procedure TSQLRestServerTest.DataAsHex(Ctxt: TSQLRestServerURIContext);
var aData: TSQLRawBlob;
begin
  if (self=nil) or (Ctxt.Table<>TSQLRecordPeople) or (Ctxt.ID<0) then
    Ctxt.Error('Need a valid record and its ID') else
  if RetrieveBlob(TSQLRecordPeople,Ctxt.ID,'Data',aData) then
    Ctxt.Results([SynCommons.BinToHex(aData)]) else
    Ctxt.Error('Impossible to retrieve the Data BLOB field');
end;

A corresponding client method may be:

function TSQLRecordPeople.DataAsHex(aClient: TSQLRestClientURI): RawUTF8;
begin
  Result := aClient.CallBackGetResult('DataAsHex',[],RecordClass,fID);
end;

If authentication is used, the current session, user and group IDs are available in Session / SessionUser / SessionGroup fields.
If authentication is not available, those fields are meaningless: in fact, Ctxt.Session will contain either 0 (CONST_AUTHENTICATION_SESSION_NOT_STARTED) if the session has not yet been started, or 1 (CONST_AUTHENTICATION_NOT_USED) if authentication mode is not active.
Server-side implementation can use the TSQLRestServer.SessionGetUser method to retrieve the corresponding user details (note that when using this method, the returned TSQLAuthUser instance is a local thread-safe copy which shall be freed when done).
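
A hedged sketch of such a retrieval, within a method-based service of the server class:

 var User: TSQLAuthUser;
 begin
   User := SessionGetUser(Ctxt.Session);
   if User<>nil then
     try
       // use User.LogonName / User.GroupRights as needed here
     finally
       User.Free; // the returned instance is a private copy: free it when done
     end;
 end;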

In Ctxt.Call^ member, you can access low-level communication content, i.e. all incoming and outgoing values, including headers and message body.
Depending on the transmission protocol used, you can retrieve e.g. HTTP header information.
For instance, here is how you can access the caller remote IP address and client application user agent:

 aRemoteIP := FindIniNameValue(pointer(Ctxt.Call.InHead),'REMOTEIP: ');
 aUserAgent := FindIniNameValue(pointer(Ctxt.Call.InHead),'USER-AGENT: ');

Browser speed-up for unmodified requests

When used over a slow network (e.g. over the Internet), you can set the optional Handle304NotModified parameter of both Ctxt.Returns() and Ctxt.Results() methods to return the response body only if it has changed since last time.

In practice, the result content will be hashed (using the crc32 algorithm), and in case of no modification the method will return a "304 Not Modified" status to the browser, without the actual result content.
Therefore, the response will be transmitted and received much faster, and will save a lot of bandwidth, especially in case of periodic server polling (e.g. for client screen refresh).

Note that in case of hash collision of the crc32 algorithm (we never did see it happen, but such a mathematical possibility exists), a false positive "not modified" status may be returned; this option is therefore unset by default, and should be enabled only if your client does not handle any sensitive accounting process, for instance.

Be aware that you should disable authentication for the methods using this Handle304NotModified parameter, via a TSQLRestServer.ServiceMethodByPassAuthentication() call.
In fact, our RESTful authentication uses a per-URI signature, which changes very often (to avoid man-in-the-middle attacks).
Therefore, any browser-side caching benefit will be voided if authentication is used: the browser's internal cache will tend to grow for nothing, since the previous URIs are deprecated, and it will be a cache miss most of the time.
But when serving some static content (e.g. HTML content, fixed JSON values or even UI binaries), this browser-side caching can be very useful.
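
A hedged sketch of such a static-content method - fStaticConfigJSON is a hypothetical RawUTF8 field, and you should check the Ctxt.Returns() overloads of your framework revision for the exact parameter position:

 procedure TMyRestServer.Config(Ctxt: TSQLRestServerURIContext);
 begin
   Ctxt.Returns(fStaticConfigJSON,HTML_SUCCESS,'',{Handle304NotModified=}true);
 end;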

Handling errors

When using Ctxt.Input*[] properties, any missing parameter will raise an EParsingException.
It will therefore be intercepted by the server process (as any other exception), and returned to the client with an error message containing the Exception class name and its associated message.

But you can have full access to the error workflow, if needed.
In fact, calling either Ctxt.Results(), Ctxt.Returns(), Ctxt.Success() or Ctxt.Error() will specify the HTTP status code (e.g. 200 / "OK" for Results() and Success() methods by default, or 400 / "Bad Request" for Error()) as an integer value. 

For instance, here is how a service not returning any content can handle those status/error codes:

procedure TSQLRestServer.Batch(Ctxt: TSQLRestServerURIContext);
begin
  if (Ctxt.Method=mPUT) and RunBatch(nil,nil,Ctxt) then
    Ctxt.Success else
    Ctxt.Error;
end;

In case of an error on the server side, you may call the Ctxt.Error() method (the only two valid success status codes are 200 and 201).

The Ctxt.Error() method has an optional parameter to specify a custom error message in plain English, which will be returned to the client in case of an invalid status code.
If no custom text is specified, the framework will return the corresponding generic HTTP status text (e.g. "Bad Request" for default status code HTML_BADREQUEST = 400).

In this case, the client will receive a corresponding serialized JSON error object, e.g. for Ctxt.Error('Missing Parameter',HTML_NOTFOUND):

{
"ErrorCode":404,
"ErrorText":"Missing Parameter"
}

If called from an AJAX client, or a browser, this content should be easy to interpret.

Note that the framework core will catch any exception during the method execution, and will return an "Internal Server Error" / HTML_SERVERERROR = 500 error code, with the associated textual exception details.
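
For instance, a hypothetical method like this one would answer with a 500 status and a JSON error object embedding the exception class name and message:

 procedure TSQLRestServerTest.AlwaysFail(Ctxt: TSQLRestServerURIContext);
 begin
   raise Exception.Create('This message will reach the client');
 end;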

Benefits and limitations of this implementation

Method-based services allow fast and direct access to all mORMot Client-Server RESTful features, over all usual protocols of our framework: HTTP/1.1, Named Pipe, Windows GDI messages, direct in-memory/in-process access.

The mORMot implementation of method-based services gives full access to the lowest-level of the framework core, so it has some advantages:

  •  It can be tuned to fit any purpose (such as retrieving or returning some HTML or binary data, or modifying the HTTP headers on the fly); 
  •  It is integrated into the RESTful URI model, so it can be related to any table/class of our ORM framework (like DataAsHex service above), or it can handle any remote query (e.g. any AJAX or SOAP requests); 
  •  It has a very low performance overhead, so can be used to reduce server workload for some common tasks.

Note that due to this implementation pattern, the mORMot service implementation is very fast, and not sensitive to the "Hash collision attack" security issue, as reported with Apache - see http://blog.synopse.info/post/2011/12/30/Hash-collision-attack for details.

But with this implementation, a lot of process (e.g. parameter marshalling) is to be done by hand on both client and server side code. In addition, building and maintaining a huge SOA system with a "method by method" approach could be difficult, since it publishes one big "flat" set of services.
This is where interfaces enter the scene.

See mORMot's interface-based services, which are even more user-friendly and easy to work with than those method-based services.

Full source code is available in our Source Code Repository.
It should work from Delphi 6 to Delphi XE5.

Feedback is welcome on our forum, as usual.


Updated mORMot database benchmark - including MS SQL


On a recent notebook computer (Core i7 and SSD drive), depending on the back-end database interfaced, mORMot excels in speed:

  • You can persist up to 570,000 objects per second, or retrieve 870,000 objects per second (for our pure Delphi in-memory engine);
  • When data is retrieved from server or client cache, you can read more than 900,000 objects per second, whatever the database back-end is;
  • With a high-performance database like Oracle and our direct access classes, you can write 62,000 (via array binding) and read 92,000 objects per second, over a 100 MB network;
  • When using alternate database access libraries (e.g. Zeos, or DB.pas based classes), speed is lower, but still enough for most work.

Difficult to find a faster ORM, I suspect.

The following tables try to sum up all available possibilities, and give some benchmarks (average objects/second for writing or reading).

In these tables:

  • 'SQLite3 (file full/off/exc)' indicates use of the internal SQLite3 engine, with or without Synchronous := smOff and/or DB.LockingMode := lmExclusive;
  • 'SQLite3 (mem)' stands for the internal SQLite3 engine running in memory;
  • 'SQLite3 (ext ...)' is about access to a SQLite3 engine as external database - either as file or memory;
  • 'TObjectList' indicates a TSQLRestServerStaticInMemory instance, either static (with no SQL support) or virtual (i.e. SQL featured via SQLite3 virtual table mechanism) which may persist the data on disk as JSON or compressed binary;
  • 'Oracle' shows the results of our direct OCI access layer (SynDBOracle.pas);
  • 'NexusDB' is the free embedded edition, available from official site;
  • 'Zeos *' indicates that the database was accessed directly via the ZDBC layer;
  • 'FireDAC *' stands for FireDAC library;
  • 'UniDAC *' stands for UniDAC library;
  • 'BDE *' when using a BDE connection;
  • 'ODBC *' for a direct access to ODBC;
  • 'Jet' stands for a MSAccess database engine, accessed via OleDB;
  • 'MSSQL local' for a local connection to a MS SQL Express 2008 R2 running instance (this was the version installed with Visual Studio 2010), accessed via OleDB.

This list of database providers is to be extended in the future. Any feedback is welcome!

Numbers are expressed in rows/second (or objects/second). This benchmark was compiled with Delphi 7, so newer compilers may give even better results, with in-lining and advanced optimizations.

Note that these tests are not about the relative speed of each database engine, but reflect the current status of the integration of several DB libraries within the mORMot database access.

Purpose here is not to say that one library or database is better or faster than another, but publish a snapshot of current mORMot persistence layer abilities.

In this timing, we do not benchmark only the "pure" SQL/DB layer access (SynDB units), but the whole Client-Server ORM of our framework: process below includes read and write RTTI access of a TSQLRecord, JSON marshaling, CRUD/REST routing, virtual cross-database layer, SQL on-the-fly translation. We just bypass the communication layer, since TSQLRestClient and TSQLRestServer are run in-process, in the same thread - as a TSQLRestServerDB instance. So you have here some raw performance testimony of our framework's ORM and RESTful core.

You can compile the "15 - External DB performance" supplied sample code, and run the very same benchmark on your own configuration.

Insertion speed

Here we insert 5,000 rows of data, with diverse scenarios:

  • 'Direct' stands for an individual Client.Add() insertion;
  • 'Batch' mode has already been described in this blog (a sketch follows this list);
  • 'Trans' indicates that all insertions are nested within a transaction - which makes a great difference, e.g. with a SQLite3 database.
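
A hedged sketch of the 'Batch Trans' scenario, using mORMot 1.x method names (Client, Rec, i, IDs and the TSQLRecordSample class are assumed to be defined elsewhere):

 Client.TransactionBegin(TSQLRecordSample);
 try
   Client.BatchStart(TSQLRecordSample);
   for i := 1 to 5000 do
     Client.BatchAdd(Rec,true); // queued on the client side, not sent yet
   Client.BatchSend(IDs);       // one single round-trip for all INSERTs
   Client.Commit;
 except
   Client.RollBack;
 end;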

The benchmark was run on a Core i7 notebook, with a standard SSD, including anti-virus and background applications, over a 100 Mb corporate network, linked to a shared Oracle 11g database. A local instance of MS SQL Express 2008 R2 was running locally. So it was a development environment, very similar to a low-cost production site, not dedicated to give the best performance. During the process, the CPU was noticeably used only for SQLite3 in-memory and TObjectList - most of the time, the bottleneck was not the CPU, but the storage or the network. As a result, rates and timing may vary depending on network and server load, but you get results similar to what could be expected on the customer side, with an average hardware configuration.

                          Direct     Batch     Trans   Batch Trans
 SQLite3 (file full)         489       475     87171        107402
 SQLite3 (file off)          720       772     91627        114673
 SQLite3 (file off exc)    28938     32642     92883        120612
 SQLite3 (mem)             78823    101696     96478        122657
 TObjectList (static)     321089    548365    312031        547105
 TObjectList (virtual)    314366    513136    316676        571232
 SQLite3 (ext full)          274       485     81570        120749
 SQLite3 (ext off)           775       848     96811        123146
 SQLite3 (ext off exc)     39638     42526    100924        134163
 SQLite3 (ext mem)         73358    102597     91309        131540
 FireDAC SQlite3           23861     47460     43404        129991
 UniDAC SQlite3              469       457     27111         36783
 ZEOS SQlite3                484       481     27866         30198
 ZEOS Firebird             11325     12340     22216         25296
 UniDAC Firebird            6387      7001      9059         10080
 Jet                        4092      4299      4774          4769
 NexusDB                    5998      6549      7668          8491
 Oracle                      643     66353      1351         55177
 ODBC Oracle                 651       639      1513          1565
 BDE Oracle                  489       511      1024          1003
 ZEOS Oracle                 517       512      1703          1858
 FireDAC Oracle              687     44749      1668         46760
 UniDAC Oracle               626       608      1315          1435
 MSSQL local                6151      5821     12717         13322

Due to its ACID implementation, the SQLite3 engine, when writing to a file, waits for the hard disk to have finished flushing its data: this is the reason why it is slower than other engines at individual row insertion (less than 10 objects per second with a mechanical hard drive, instead of an SSD) outside the scope of a transaction.

So if you want to reach the best writing performance in your application with the default engine, you should use transactions and regroup all writing into services or a BATCH process. Another possibility could be to execute DB.Synchronous := smOff and/or DB.LockingMode := lmExclusive at SQLite3 engine level before the process: in case of a power loss at the wrong time, it may corrupt the database file, but it will increase the rate by a factor of 50 (with a hard drive), as stated by the "off" and "off exc" rows of the table. Note that by default, the FireDAC library sets both options, so its results above are to be compared with the "SQLite3 off exc" rows.
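For instance, a minimal sketch - assuming Server is an already initialized TSQLRestServerDB instance:

  // relax the ACID-ness of the SQLite3 engine, for much faster writing
  Server.DB.Synchronous := smOff;       // do not wait for the disk flush
  Server.DB.LockingMode := lmExclusive; // keep an exclusive lock on the file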

For both our direct Oracle access layer (SynDBOracle.pas) and FireDAC, the Batch process benefits a lot from the array binding feature (known as Array DML in FireDAC/AnyDAC).

Reading speed

Now the same data is retrieved via the ORM layer:

  • 'By one' states that one object is read per call (the ORM generates a SELECT * FROM table WHERE ID=? for the Client.Retrieve() method);
  • 'All *' is when all 5000 objects are read in a single call (i.e. running SELECT * FROM table from a FillPrepare() method call), either forced to use the virtual table layer, or with a direct static call - both modes are sketched after this list.
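A minimal sketch, assuming a Client: TSQLRestClientURI instance and the TSQLRecordPeople class from the samples:

var Rec: TSQLRecordPeople;
begin
  // 'By one' mode: retrieve a single object, by its ID
  Rec := TSQLRecordPeople.Create(Client,23);
  try
    // use Rec published properties here
  finally
    Rec.Free;
  end;
  // 'All' modes: retrieve all objects in a single call
  Rec := TSQLRecordPeople.CreateAndFillPrepare(Client,'');
  try
    while Rec.FillOne do
      ; // process each row here
  finally
    Rec.Free;
  end;
end;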

Here are some reading speed values, in objects/second:

                         By one   All Virtual   All Direct
SQLite3 (file full)       25722        436147       435464
SQLite3 (file off)        25541        401800       420450
SQLite3 (file off exc)   111410        431294       422261
SQLite3 (mem)            114440        439483       449640
TObjectList (static)     303398        529661       799232
TObjectList (virtual)    308109        403323       871080
SQLite3 (ext full)       122558        244630       445037
SQLite3 (ext off)        121716        236406       442909
SQLite3 (ext off exc)    120200        243439       442477
SQLite3 (ext mem)        121589        240246       443144
FireDAC SQlite3            7715         85565       112989
UniDAC SQlite3             2270         75629       100240
ZEOS SQlite3               1865         95227       110367
ZEOS Firebird             15947         70409        89132
UniDAC Firebird            2150         73236        93396
Jet                        2582        152262       224638
NexusDB                    1413        120845       208246
Oracle                     1617         96279       130955
ODBC Oracle                1589         37910        49028
BDE Oracle                  860          3870         4036
ZEOS Oracle                1762         68124        79546
FireDAC Oracle             1290         55067        72251
UniDAC Oracle               854         33144        41386
MSSQL local               10160        201588       369521

The SQLite3 layer gives amazing reading results, which makes it a perfect fit for most typical ORM use. When running with DB.LockingMode := lmExclusive defined (i.e. "off exc" rows), reading speed is very high, and benefits from exclusive access to the database file. External database access is only required when data is expected to be shared with other processes.

In the above table, it appears that all libraries based on DB.pas are slower than the others for reading speed. In fact, TDataSet sounds like a real bottleneck, due to its internal data marshalling. Even FireDAC, which is known to be very optimized for speed, is limited by the TDataSet structure. Our direct classes, or even ZEOS/ZDBC, perform better, since they are able to output JSON content with no additional marshalling.

For both writing and reading, TObjectList / TSQLRestServerStaticInMemory engine gives impressive results, but has the weakness of being in-memory, so it is not ACID by design, and the data has to fit in memory. Note that indexes are available for IDs and stored AS_UNIQUE properties.

As a consequence, searching non-unique values may be slow: the engine has to loop through all rows of data. But for unique values (defined as stored AS_UNIQUE), both insertion and search speeds are excellent, due to an optimized O(1) hash algorithm - see the following benchmark, especially the "By name" row for the "TObjectList" columns, which corresponds to the search of a unique RawUTF8 property value via this hashing method.

                          By one   By name   All Virtual   All Direct
SQLite3 (file full)        10461      9694        167095       167123
SQLite3 (file off)         10549      9651        162956       144250
SQLite3 (mem)              44737     32350        168651       168577
TObjectList (static)      103577     70534        253292       254284
TObjectList (virt.)       103553     60153        118203       256383
SQLite3 (ext file full)    43367     22785         97083       170794
SQLite3 (ext file off)     44099     22240         90592       165601
SQLite3 (ext mem)          45220     23055         94688       168856
Oracle                       901       889         56639        88342
Jet                         1074      1071         52764        75999

The above results were run on a Core 2 Duo laptop, so numbers are lower than in the previous tables.
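As a reminder, such a unique property is declared with the stored AS_UNIQUE attribute - here is a minimal sketch, TSQLMyEntity being an illustrative class name:

type
  TSQLMyEntity = class(TSQLRecord)
  protected
    fName: RawUTF8;
  published
    // stored AS_UNIQUE marks this field as unique, enabling the O(1)
    // hashed search for TObjectList-based storage
    property Name: RawUTF8 read fName write fName stored AS_UNIQUE;
  end;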

Analysis and use case proposal

When declared as virtual table (via a VirtualTableRegister call), you have the full power of SQL (including JOINs) at hand, with incredibly fast CRUD operations: 100,000 requests per second for objects read and write, including serialization and Client-Server communication!

Some providers are first-class citizens to mORMot, like SQLite3, Oracle, or MS SQL. You can connect to them without the bottleneck of the DB.pas unit, nor any restriction of your Delphi license (a Starter edition is enough). For instance, SQLite3 could be used as the main database engine for a client-server application with heavy concurrent access - if you have doubts about its scaling abilities, see this blog article. Direct access to Oracle is also available, with impressive results in BATCH mode (aka array binding). MS SQL Server, directly accessed via OleDB (or ODBC), gives pretty good timing, and a MS SQL Server 2008 R2 Express instance is a convincing option, for a very affordable price (i.e. for free) - the LocalDB (MSI installer) edition is enough to start with. Any other OleDB, ODBC or ZDBC providers may also be used, with direct access. For instance, Firebird embedded gives pretty consistent timing, when accessed via Zeos/ZDBC.

But mORMot is very open-minded: you can use any DB.pas provider, e.g. FireDAC, UniDAC, DBExpress, NexusDB or even the BDE, at the cost of the additional layer introduced by the TDataSet instance, at reading time.

Note that all those tests were performed locally and in-process, via a TSQLRestClientDB instance. For both insertion and reading, a Client-Server architecture (e.g. using HTTP/1.1 for mORMot clients) will give even better relative results for the BATCH and retrieve-all modes.

During the tests, internal caching was disabled, so you may expect speed enhancements for real applications, when data is more read than written: for instance, when an object is retrieved from the cache, you achieve more than 700,000 read requests per second, whatever database is used.

Therefore, the typical use may be the following:

Database - Created by - Use:

  • int. SQLite3 file - default - general safe data handling, with amazing speed in "off exc" mode;
  • int. SQLite3 mem - :memory: - fast data handling with no persistence (e.g. for testing or temporary storage);
  • TObjectList static - StaticDataCreate - best possible performance for a small amount of data, without ACID nor SQL;
  • TObjectList virtual - VirtualTableRegister - best possible performance for a small amount of data, if ACID is not required nor complex SQL;
  • ext. SQLite3 file - VirtualTableExternalRegister - external back-end, e.g. for disk spanning;
  • ext. SQLite3 mem - VirtualTableExternalRegister - fast external back-end (e.g. for testing);
  • ext. Oracle / MS SQL / Firebird - VirtualTableExternalRegister - fast, secure and industry standard; data can be shared outside mORMot;
  • ext. NexusDB - VirtualTableExternalRegister - the free embedded version lets the whole engine be included within your executable, and individual insertion speed is higher than SQLite3's, so it may be a good alternative if your project mostly inserts individual objects - using a batch within a transaction makes SQLite3 the faster engine;
  • ext. Jet - VirtualTableExternalRegister - could be used as a data exchange format (e.g. with Office applications);
  • ext. Zeos/FireDAC/UniDAC - VirtualTableExternalRegister - allows access to several external engines, with some advantages for Zeos, since direct ZDBC access bypasses the DB.pas unit and its TDataSet bottleneck - and we would also prefer an active Open Source project!

Whatever database back-end is used, don't forget that mORMot design will allow you to switch from one library to another, just by changing a TSQLDBConnectionProperties class type. And note that you can mix external engines, on purpose: you are not tied to one single engine, but the database access can be tuned for each ORM table, according to your project needs.
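For instance, here is a hedged sketch of such a switch - assuming a Model: TSQLModel instance and the TSQLRecordPeople class are already defined, and that the needed SynDB* units are in the uses clause:

var Props: TSQLDBConnectionProperties;
begin
  // direct Oracle access, via SynDBOracle.pas
  Props := TSQLDBOracleConnectionProperties.Create('TnsName','','user','pass');
  // switching to another engine only needs another class, e.g. for ODBC:
  // Props := TODBCConnectionProperties.Create('DSN','','user','pass');
  VirtualTableExternalRegister(Model,TSQLRecordPeople,Props,'People');
end;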

New Open Source Multi-Thread ready Memory Manager: SAPMM


Article edit/update/feedback:
SapMM did work fine for simple tests.
But after trying SapMM with our regression tests, it sounds as if it just crashes (just like SynScaleMM, by the way), when it reaches the interface-based service level.
IMHO this is not due to our own code, since when running memory-proof tools (like FastMM4 in full debug mode), no corrupted block is identified.
We found out that only FastMM4 and ScaleMM2 are able to run all our regression tests (more than 14,000,000 individual checks, including a whole test coverage of our framework, in about 20 seconds!).
Sounds weird and disappointing. SapMM may not be as stable as expected - it is at the level of SynScaleMM, not a true alternative to FastMM4.
ScaleMM2 works pretty well, even if it eats a lot more memory than FastMM4, when we reach the multi-threaded part of the regression tests (a thread pool of 50 threads is created several times during those tests, and at this level, the global memory use of the process just raised up).
FastMM4 is still a good alternative, even in a multi-threaded environment, since mORMot code patterns try to limit memory allocation as much as possible, so contention is reduced as much as possible.

Initial article beginning:

Do you remember this former article about scalability of the Delphi memory manager, in multi-thread execution context?

Our SynScaleMM is still experimental.
But it did pretty well, for an experiment!

At first, you can take a look at ScaleMM2, which is more stable, and based on the same ground.

But a new multi-thread friendly memory manager for Delphi just came out.
It is in fact the anonymous (and famous) "NN memory manager" Primož talked about in his article about string building and memory managers.


(Note that in this article, our SynScaleMM was found to scale very well, but on the other hand, Primož did compile his benchmark program in Debug mode, so our TTextWriter was not in good shape: when you compile in Release mode, optimizations and inlining are ON, and our good TTextWriter just flies... See the note at the beginning of the article - this is why I never find those benchmarks very informative. I always prefer profiling from the real world, with real useful process... and was never convinced by any such naive benchmark.)
(Edit: these simple concatenation tests did not show up any instability problem of SapMM - alias NN - whereas our own mORMot regression tests were much more demanding, because closer to real use cases. It won't help convincing me that such a naive benchmark is very indicative and meaningful...)

OK, back to our business!

SapMM is an interesting beast.
https://code.google.com/p/sapmm

It sounds as if Alexei (the initial coder) has a C coding background. But that's fine when you have to deal with low-level structures and algorithms, as required by a memory manager. :)
It features everything we may ask for such a piece of code: clear design, optimized code (mostly by inlining process), memory leak reporting, some parameters for tuning.

It is only for Delphi XE (and up) under Win32 by now, but contributors are welcome!
It has been used in production for more than half a year, and it passed all the FastCode MM benchmark tests.

If you want a direct link of the today's source code, without SVN, you may try this direct link from our site.
(but it probably will never be updated - you are warned)

You could also compare it with the memory manager embedded within the FreePascal Compiler.
It also has a per-thread heap, with another implementation design. And it is now pretty stable and cross-platform!
It uses some nice FPC compiler tricks, like a prefetch() function, which is quite unique and powerful when dealing with such low-level stuff as a memory manager.
The FPC guys did a great job at the compiler level, and they do not forget to optimize the RTL in their work, which is pretty reassuring for the future - do you follow my mind? :)

JSON record serialization


In Delphi, the record has some nice advantages:

  • records are value objects, i.e. accessed by value, not by reference - this can be very convenient, e.g. when defining a Domain-Driven Design;
  • records can contain any other record or dynamic array, so are very convenient to work with (no need to define sub-classes or lists);
  • record variables can be allocated on the stack, so won't solicit the global heap;
  • record instances are automatically freed by the compiler when they come out of scope, so you won't need to write any try..finally Free; end block.

Serialization of record values is therefore a must-have for a framework like mORMot.

In recent commits, this JSON serialization of record has been enhanced.
In particular, we introduced JSON serialization via a new text-based record definition.

Default Binary/Base64 serialization

By default, any record value will be serialized with a proprietary binary (and optimized) layout - i.e. via RecordLoad and RecordSave functions - then encoded as Base64, to be stored as plain text within the JSON stream.

A special UTF-8 prefix (which does not match any existing Unicode glyph) is added at the beginning of the resulting JSON string to identify this content as a BLOB, as such:

 { "MyRecord": "ï¿°w6nDoMOnYQ==" }

You will find in the SynCommons unit both BinToBase64 and Base64ToBin functions, very optimized for speed.
Base64 encoding was chosen since it is standard, much more efficient than hexadecimal, and still JSON compatible without the need to escape its content.
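As a rough sketch of what happens internally - TMyRecord being a placeholder for any record type, and the special marker prepending being omitted here:

var bin: RawByteString;
    b64: RawUTF8;
    Value: TMyRecord; // any record type, previously filled
begin
  bin := RecordSave(Value,TypeInfo(TMyRecord)); // proprietary binary layout
  b64 := BinToBase64(bin); // compact text, ready to be embedded in JSON
end;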

When working with most part of the framework, you do not have anything to do: any record will by default follow this Base64 serialization, so you will be able e.g. to publish or consume interface-based services with records.

Custom serialization

Base64 encoding is pretty convenient for a computer (it is a compact and efficient format), but it is very limited about its interoperability. Our format is proprietary, and will use the internal Delphi serialization scheme: it means that it won't be readable nor writable outside the scope of your own mORMot applications. In a RESTful/SOA world, this sounds like a limitation rather than a feature.

Custom record JSON serialization can therefore be defined, as with any class.
It will allow writing and parsing record variables as regular JSON objects, ready to be consumed by any client or server. Internally, some callbacks will be used to perform the serialization.

In fact, there are two entry points to specify a custom JSON serialization for record:

  • When setting a custom dynamic array JSON serializer - the associated record will also use the same Reader and Writer callbacks; 
  • By setting explicitly the serialization callbacks for the TypeInfo() of the record, with the very same TTextWriter.RegisterCustomJSONSerializer method used for dynamic arrays.

Then the Reader and Writer callbacks can be defined by two means:

  • By hand, i.e. coding the methods with manual conversion to JSON text or parsing; 
  • Via some text-based type definition, which will follow the record layout, but will do all the marshaling (including memory allocation) on its own.

Defining callbacks

For instance, if you want to serialize the following record:

  TSQLRestCacheEntryValue = record
    ID: integer;
    TimeStamp: cardinal;
    JSON: RawUTF8;
  end;

With the following code:

  TTextWriter.RegisterCustomJSONSerializer(TypeInfo(TSQLRestCacheEntryValue),
    TTestServiceOrientedArchitecture.CustomReader,
    TTestServiceOrientedArchitecture.CustomWriter);

The expected format will be as such:

 {"ID":1786554763,"TimeStamp":323618765,"JSON":"D:\\TestSQL3.exe"}

Therefore, the writer callback could be:

class procedure TTestServiceOrientedArchitecture.CustomWriter(
  const aWriter: TTextWriter; const aValue);
var V: TSQLRestCacheEntryValue absolute aValue;
begin
  aWriter.AddJSONEscape(['ID',V.ID,'TimeStamp',Int64(V.TimeStamp),'JSON',V.JSON]);
end;

In the above code, the cardinal field named TimeStamp is type-casted to an Int64: in fact, as stated by the documentation of the AddJSONEscape method, an array of const will handle by default any cardinal as an integer value (this is a limitation of the Delphi compiler). By forcing the type to be an Int64, the expected cardinal value will be transmitted, and not a wrongly negative version for numbers > $7fffffff.

On the other side, the corresponding reader callback would be like:

class function TTestServiceOrientedArchitecture.CustomReader(P: PUTF8Char;
  var aValue; out aValid: Boolean): PUTF8Char;
var V: TSQLRestCacheEntryValue absolute aValue;
    Values: TPUtf8CharDynArray;
begin
  result := JSONDecode(P,['ID','TimeStamp','JSON'],Values);
  if result=nil then
    aValid := false else begin
    V.ID := GetInteger(Values[0]);
    V.TimeStamp := GetCardinal(Values[1]);
    V.JSON := Values[2];
    aValid := true;
  end;
end;

Text-based definition

Writing those callbacks by hand could be error-prone, especially for the Reader event.

You can use the TTextWriter.RegisterCustomJSONSerializerFromText method to define the record layout in a convenient text-based format.

The very same TSQLRestCacheEntryValue can be defined as with a typical pascal record:

const
  __TSQLRestCacheEntryValue = 'ID: integer; TimeStamp: cardinal; JSON: RawUTF8';

Or with a shorter syntax:

const
  __TSQLRestCacheEntryValue = 'ID integer TimeStamp cardinal JSON RawUTF8';

Both declarations will do the same definition. Note that the supplied text should match exactly the original record type definition: do not swap or forget any property!

By convention, we use two underscore characters (__) before the record type name, to easily identify the layout definition. It may indeed be convenient to write it as a constant, close to the record type definition itself, and not in-lined at RegisterCustomJSONSerializerFromText() call level.

Then you register your type as such:

  TTextWriter.RegisterCustomJSONSerializerFromText(
    TypeInfo(TSQLRestCacheEntryValue),__TSQLRestCacheEntryValue);

Now you are able to serialize any record value directly:

  Cache.ID := 10;
  Cache.TimeStamp := 200;
  Cache.JSON := 'test';
  U := RecordSaveJSON(Cache,TypeInfo(TSQLRestCacheEntryValue));
  Check(U='{"ID":10,"TimeStamp":200,"JSON":"test"}');

You can also unserialize some existing JSON content:

  U := '{"ID":210,"TimeStamp":2200,"JSON":"test2"}';
  RecordLoadJSON(Cache,@U[1],TypeInfo(TSQLRestCacheEntryValue));
  Check(Cache.ID=210);
  Check(Cache.TimeStamp=2200);
  Check(Cache.JSON='test2');

Note that this text-based definition is very powerful, and is able to handle any level of nested record or dynamic arrays.

By default, it will write the JSON content in a compact form, and will expect only existing fields to be available in the incoming JSON. You can specify some options at registration, to ignore all non defined fields. It can be very useful when you want to consume some remote service, and are interested only in a few fields.

For instance, we may define a client access to a RESTful service like api.github.com.

type
  TTestCustomJSONGitHub = packed record
    name: RawUTF8;
    id: cardinal;
    description: RawUTF8;
    fork: boolean;
    owner: record
      login: RawUTF8;
      id: cardinal;
    end;
  end;
  TTestCustomJSONGitHubs = array of TTestCustomJSONGitHub;

const
  __TTestCustomJSONGitHub = 'name RawUTF8 id cardinal description RawUTF8 '+
    'fork boolean owner{login RawUTF8 id cardinal}';

Note the format to define a nested record, as a shorter alternative to a nested record .. end syntax.

It is also mandatory that you declare the record as packed.
Otherwise, you may have unexpected access violation issues, since the alignment may vary, depending on the compiler settings and revision.

Now we can register the record layout, and provide some additional options:

  TTextWriter.RegisterCustomJSONSerializerFromText(TypeInfo(TTestCustomJSONGitHub),
    __TTestCustomJSONGitHub).Options := [soReadIgnoreUnknownFields,soWriteHumanReadable];

Here, we defined:

  • soReadIgnoreUnknownFields to ignore any non defined field in the incoming JSON; 
  • soWriteHumanReadable to let the output JSON be more readable.

Then the JSON can be parsed then emitted as such:

var git: TTestCustomJSONGitHubs;
 ...
  U := zendframeworkJson;
  Check(DynArrayLoadJSON(git,@U[1],TypeInfo(TTestCustomJSONGitHubs))<>nil);
  U := DynArraySaveJSON(git,TypeInfo(TTestCustomJSONGitHubs));

You can see that the record serialization is auto-magically available at dynamic array level, which is pretty convenient in our case, since the api.github.com RESTful service returns a JSON array.

It will convert 160 KB of very verbose JSON information:

[{"id":8079771,"name":"Component_ZendAuthentication","full_name":"zendframework/Component_ZendAuthentication","owner":{"login":"zendframework","id":296074,"avatar_url":"https://1.gravatar.com/avatar/460576a0866d93fdacb597da4b90f233?d=https%3A%2F%2Fidenticons.github.com%2F292b7433472e2946c926bdca195cec8c.png&r=x","gravatar_id":"460576a0866d93fdacb597da4b90f233","url":"https://api.github.com/users/zendframework","html_url":"https://github.com/zendframework","followers_url":"https://api.github.com/users/zendframework/followers","following_url":"https://api.github.com/users/zendframework/following{/other_user}","gists_url":"https://api.github.com/users/zendframework/gists{/gist_id}","starred_url":"https://api.github.com/users/zendframework/starred{/owner}{/repo}",...

Into the much smaller (6 KB) and readable JSON content, containing only the information we need:

[
{
"name": "Component_ZendAuthentication",
"id": 8079771,
"description": "Authentication component from Zend Framework 2",
"fork": true,
"owner":
{
"login": "zendframework",
"id": 296074
}
},
{
"name": "Component_ZendBarcode",
"id": 8079808,
"description": "Barcode component from Zend Framework 2",
"fork": true,
"owner":
{
"login": "zendframework",
"id": 296074
}
},
...

During the parsing process, all unneeded JSON members will just be ignored.
The parser will jump over this data, without doing any temporary memory allocation.
This is a huge difference with other existing Delphi JSON parsers, which first create a tree of all JSON values into memory, then allow to browse all the branches on request.

Note also that the fields have been ordered following the TTestCustomJSONGitHub record definition, which may not match the original JSON layout (here name/id fields order is inverted, and owner is set at the end of each item, for instance).

With mORMot, you can then access directly the content from your Delphi code as such:

  if git[0].id=8079771 then begin
    Check(git[0].name='Component_ZendAuthentication');
    Check(git[0].description='Authentication component from Zend Framework 2');
    Check(git[0].fork=true);
    Check(git[0].owner.login='zendframework');
    Check(git[0].owner.id=296074);
  end;

Note that we do not need to use intermediate objects (e.g. via some obfuscated expressions like gitarray.Value[0].Value['owner'].Value['login']).
Your code will be much more readable, will complain at compilation if you misspell any field name, and will be easy to debug within the IDE (since the record layout can be easily inspected).

The serialization is able to handle any kind of nested record or dynamic arrays, including dynamic arrays of simple types (e.g. array of integer or array of RawUTF8), or dynamic arrays of record:

type
  TTestCustomJSONRecord = packed record
    A,B,C: integer;
    D: RawUTF8;
    E: record E1,E2: double; end;
    F: TDateTime;
  end;
  TTestCustomJSONArray = packed record
    A,B,C: integer;
    D: RawUTF8;
    E: array of record E1: double; E2: string; end;
    F: TDateTime;
  end;
  TTestCustomJSONArraySimple = packed record
    A,B: Int64;
    C: array of SynUnicode;
    D: RawUTF8;
  end;

The corresponding text definitions may be:

const
  __TTestCustomJSONRecord = 'A,B,C integer D RawUTF8 E{E1,E2 double} F TDateTime';
  __TTestCustomJSONArray  = 'A,B,C integer D RawUTF8 E[E1 double E2 string] F TDateTime';
  __TTestCustomJSONArraySimple = 'A,B Int64 C array of synunicode D RawUTF8';

Only the main Delphi simple types are handled by this feature (boolean byte word integer cardinal Int64 single double TDateTime TTimeLog string RawUTF8 SynUnicode WideString), then nested record or dynamic arrays. For a dynamic array nested property, you can use either the standard "array of" keywords, or the shorter [...fields...] - only for arrays of records, of course. For other types (like enumerations or sets), you can simply use the unsigned integer types corresponding to the binary value, e.g. byte word cardinal Int64 (depending on the sizeof() of the initial value).

You can refer to the supplied regression tests (in TTestLowLevelTypes.EncodeDecodeJSON) for some more examples of custom JSON serialization.

Feedback is welcome on our forum, as usual!

Domain-Driven Design: part 1


One year ago, we already made a quick presentation of Domain-Driven Design, in the context of our mORMot framework.
After one year of real-world application of those patterns, and a training given by a great French software designer named Jérémie Grodziski, it is now time to give more light to DDD.

Let's start with part 1, which will be a general introduction to Domain-Driven Design, trying to state how it may be interesting (or not) for your projects.

Definition

http://domaindrivendesign.org gives the somewhat "official" definition of Domain-Driven design (DDD):

Over the last decade or two, a philosophy has developed as an undercurrent in the object community.
The premise of domain-driven design is two-fold:

  • For most software projects, the primary focus should be on the domain and domain logic;
  • Complex domain designs should be based on a model.

Domain-driven design is not a technology or a methodology. It is a way of thinking and a set of priorities, aimed at accelerating software projects that have to deal with complicated domains.

Of course, this particular architecture is customizable according to the needs of each project.
We simply propose following an architecture that serves as a baseline to be modified or adapted by architects according to their needs and requirements.

Patterns

In respect to other kinds of Multi-Tier architectures, DDD introduces some restrictive patterns, for a cleaner design:

  • Focus on the Domain - i.e. a particular kind of knowledge;
  • Define Bounded Contexts within this domain;
  • Create an evolving Model of the domain, ready to be consumed by applications;
  • Identify some kinds of objects - called Value Objects or Entity Objects / Aggregates;
  • Use an Ubiquitous Language in the resulting model and code;
  • Isolate the domain from other kinds of concern (e.g. persistence should not be called from the domain layer - i.e. the domain should not be polluted by technical considerations, but rely on the Factory and Repository patterns);
  • Publish the domain as well-defined uncoupled Services;
  • Integrate the domain services with existing applications or legacy code.

The following diagram is a map of the patterns presented, and of the relationships between them.
It is inspired by the one included in Eric Evans's reference book, "Domain-Driven Design", Addison-Wesley, 2004 (and updated to take into account some points which appeared since).

You may recognize a lot of existing patterns you have already met or implemented. What makes DDD unique is that those patterns have been organized around some clear concepts, thanks to decades of business software experience.

Is DDD good for you?

Domain-Driven design is not to be used everywhere, and in every situation.

First of all, the following are pre-requisites of using DDD:

  • Identified and well-bounded domain (e.g. your business target should be clearly identified);
  • You must have access to domain experts, to establish a creative collaboration, in an iterative (maybe agile) way;
  • Skilled team, able to write clean code - note also that since DDD is more about code expressiveness than technology, it may not appear so "trendy" to youngest developers;
  • You want your internal team to accumulate knowledge of the domain - therefore, outsourcing may be constrained to applications, not the core domain.

Then check that DDD is worth it, i.e. if:

  • It helps you solving the problem area you are trying to address;
  • It meets your strategic goals: DDD is to be used where you will get your business money, and make you distinctive from your competitors;
  • You need to bring clarity, and need to solve inner complexity, e.g. modeling a lot of rules (you won't use DDD to build simple applications - in this case, RAD may be enough);
  • Your business is exploring: your goal is identified, but you do not know how to accomplish it;
  • You don't need to have all of these concerns at once - but at least one or two should apply.

Introducing DDD

Perhaps DDD sounds more appealing to you now. In this case, our mORMot framework would provide all the bricks you need to implement it, focusing on your domain and letting the libraries do all the needed plumbing.
If you identified that DDD is not to be used now, you will always find with mORMot the tools you need, ready to switch to DDD when it would be necessary.

Legacy systems will benefit from DDD patterns. Finding so-called seams, along with isolating your core domain, can be extremely valuable when using DDD techniques to refactor and tighten the highest value parts of your code. It is not mandatory to re-write your whole existing software with DDD patterns everywhere: once you have identified where your business strategy's core is, you can introduce DDD progressively in this area. Then, following continuous feedback, you will refine your code, adding regression tests, and isolating your domain code from end-user code.

Let's continue with part 2, which will define Domain-Driven Design high-level model principles.

Domain-Driven Design: part 2


One year ago, we already made a quick presentation of Domain-Driven Design, in the context of our mORMot framework.
After one year of real-world application of those patterns, it is now time to give more light to DDD.

Let's continue with part 2, which will define Domain-Driven Design high-level model principles.

Domain

What do we call Domain here?
The domain represents a sphere of knowledge, influence or activity.

As we already stated above, the domain has to be clearly identified, and your software is expected to solve a set of problems related to this domain.

DDD is a special case of Model-Driven Design. Its purpose is to create a model of a given domain.
The code itself will express the model: as a consequence, any code refactoring means changing the model, and vice-versa.

Modeling

Even the brightest programmer would never be able to convert a real-life domain into its software code.
What we can do is to create an abstraction system that describes selected aspects of a domain.

Modeling is about filtering the reality, for a given use context:
 "All models are wrong, some are useful" G. Box, statistician.

Several Models to rule them all

As a first consequence, several models may coexist for a given reality, depending on the knowledge level involved - what we call a Bounded Context.
Don't be afraid if the same reality is defined several times in your domain code: you should use only one class in a given context, but you may have another class defined in another context, with diverse attributes or methods.
Just open Google Maps for instance, and think how the same reality may be modeled depending on the zoom level, or your current view options. See also the M1, M2, M3 models as defined in the Meta-Object Facility. When you define several models, you just need to clearly state the current model you are using.

Even models could be abstracted. This is what DDD does: the code itself is some kind of meta-model, conforming a given conceptual model to the grammar of a given programming language.

The state of the model

Most models express the reality in two dimensions:

  • Static: to abstract a given state of the reality;
  • Dynamic: to abstract how reality evolves (i.e. its behavior).

In both dimensions, we can clearly understand the purpose of abstraction.

Since it is impossible to model all the details of reality (e.g. describe a physical reality down to atomic / sub-atomic level), the static modeling will forget the non significant details, and focus on the essentials, for a given knowledge level, which is specific to a given context.

Similarly, most changes are continuous in the world, but dynamic modeling will create static snapshots of the reality (called state transitions), to embrace the deterministic nature of computers.

State always brings complexity to the model. As a consequence, our code should be as stateless as possible.
Therefore:

  • Try to always separate value and time in state;
  • Reduce statefulness to the only necessary;
  • Implement your logic as state machines instead of blocking code or sessions;
  • Persistence should handle one-way transactions.

In DDD, Value Objects and Entity Objects are the means to express a given system state. Immutable Value Objects define a static value. An Entity refers to a given state of a given identity (or reality).
For instance, the same identity (named "John Doe") may be, at a given state, single and minor, then, at another state, married and adult. The model will help to express the given states, and the state transitions between them (e.g. John's marriage).

In DDD, the Factory / Repository / Unit Of Work patterns will introduce transactional support in a stateless approach.

And in situations where a reality changes its state very often, with complex impacts on other components, DDD would model these state changes as Events. It could lead to introducing some Event-Driven Design, or even Event Sourcing, within the global model.

Composition

In order to refine your model, you have two main tools at hand to express the model modularity:

  • Partitioning: the more your elements have a separated concern, the better;
  • Grouping: to express constraints, elements may be grouped - but usually, you should not put more than 6 or 8 elements in the same diagram, or your model may need to be refined.

In DDD, a lot of small objects have to be defined, in order to properly partition the logic. When we start with Object Oriented Programming, we are tempted to create huge classes with a lot of methods and parameters. This is a symptom of a weak model. We should always favor composition of small simple objects, just like the Unix tools philosophy or the Single Responsibility Principle.

Some DDD experts also do not favor inheritance. In fact, inheriting may also be a symptom of some coupled context. Having two diverse realities sharing properties may be a bad design smell: if two or more classes inherit from one parent class, the state and behavior of the parent class may limit any future evolution of any of its children. In practice, trying to follow the Open/Closed SOLID principle at class level may induce unexpected complexity, therefore reducing code maintainability.

In DDD, the Aggregate Root is how you group your objects, in order to let constraints (e.g. business rules) be modeled. Aggregates are the main entry point to the domain, since they should contain, by design, the whole execution context of a given process. Their extent may vary during development, e.g. when a business rule evolves - remember that the same reality can appear several times in the same domain, but once per Bounded Context. In other words, Aggregates could be seen as the smallest and biggest extent needed to express a given model context.

Let's continue with part 3, which will define Domain-Driven Design main patterns and principles.

Domain-Driven Design: part 3


One year ago, we already made a quick presentation of Domain-Driven Design, in the context of our mORMot framework.
After one year of real-world application of those patterns, it is now time to give more light to DDD.

Let's continue with part 3, which will define Domain-Driven Design patterns and principles - this will be the main article of the whole series!

Ubiquitous Language

Ubiquitous Language is where DDD begins.

DDD expects the domain model to be expressed via a shared language, and used by all team members to connect their activities with the software. Those terms should be used in speech, writing, and any presentation or diagram.

In the real outside world, i.e. for the other of the 10 kinds of people - those who do not know about binary - domain experts use company- or industry-standard terminology.

As developers, we have to understand this vocabulary and not only use it when speaking with domain experts but also see the same terminology reflected in our code. If the terms "class code" or "rate sets" or "exposure" are frequently used in conversation, we shall find corresponding class names in the code. In DDD, it is critical that developers use the business language in code consciously and as a disciplined rule. As a consequence, browsing the code should lead into a clear comprehension of the business model.

Domain experts will be the gatekeepers of the consistency of this language, and of its proper definition. Even if the terms are expected to be consistent, they are not to be written in stone, especially during the initial phase of software development. As soon as one domain activity cannot be expressed using the existing set of concepts, the model needs to be extended. Removing ambiguities and inconsistencies is a need, and will, very often, resolve several not-yet-identified software issues.

Value Objects and Entities

For the definition of your objects or internal data structures (what good programmers care about), you are encouraged to make a difference between several kinds of objects. Following DDD, model-level representations are, generally speaking, rich in behavior, and therefore involve several families/species of objects.

Let us list the most high-level definitions of objects involved to define our DDD model:

  • Value Objects contain attributes (value, size) but no conceptual identity - e.g. money bills, or seats in a Rock concert, as they are interchangeable;
  • Entity objects are not defined by their attributes (values), but by their thread of continuity, signified by an identity - e.g. persons, or seats in most planes, as each one is unique and identified.

The main difference between Value Objects and Entities is that instances of the second type are tied to one reality, which evolves in the time, therefore creating a thread of continuity.

Value objects are immutable by definition, so should be handled as read-only. In other words, they are incapable of change once they are created.
Why is it important that they be immutable? With Value Objects, you're seeking side-effect-free functions, yet another concept borrowed by DDD from functional languages (and not available in most OOP languages, until recent concurrent object definitions like in Rust, or the Immutable Collections introduced in C#/.NET 4.5). When you add $10 to $20, are you changing $20? No, you are creating a new money descriptor of $30. A similar behavior should be visible at code level.
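A minimal sketch of such a side-effect-free function, over an illustrative TMoney Value Object (not a mORMot type):

type
  TMoney = record
    Cents: Int64;
    Currency: RawUTF8;
  end;

function MoneyAdd(const A,B: TMoney): TMoney;
begin // side-effect-free: A and B are left untouched, a new value is returned
  Assert(A.Currency=B.Currency);
  result.Cents := A.Cents+B.Cents;
  result.Currency := A.Currency;
end;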

Entities will very likely have an ID field, able to identify a given reality, and model the so-called thread of continuity of this identity. But this ID is an implementation detail, only used at the Persistence Layer level: at the Domain Layer level, you should not access Entities individually, but via a special Entity bound to a specific context, called an Aggregate Root (see the next paragraph).

When we define some objects, we should focus on making the implicit become explicit. For instance, if we have to store a phone number, we won't use a plain string type for it, but we should create a dedicated Value object type, making explicit all the behavior of its associated reality. Then we will be free to combine all types into explicit grouped types, on need.

Aggregates

Aggregates are a particular case of Entities, defined as collections of objects (nested Values and/or Entities) that are grouped together by a root Entity, otherwise known as an Aggregate Root, whose scope has been defined by a given execution context - see "Composition" above.

Typically, Aggregates are persisted in a database, and guarantee the consistency of changes by isolating their members from external objects (i.e. you can link to an aggregate via its ID, but you cannot directly access its internal objects). See the Shared-Nothing architecture (or sharding), which sounds like an Aggregate-Oriented Database.

In practice, Aggregates may be the only kind of objects which will be persisted at the Application layer, before calling the domain methods: even if each nested Entity may have its own persistence method (e.g. one RDBMS table per Entity), Aggregates may be the unique access point to retrieve or update a given state. It will ensure so-called Persistence Ignorance, meaning that domain should remain uncoupled to any low-level storage implementation detail.

DDD services may just permit remote access to Aggregates methods, where the domain logic will be defined and isolated.

Factory and Repository patterns

DDD then favors some patterns to use those objects efficiently.

The Factory pattern is used to create object instances. In strongly-typed OOP (like in Delphi, Java or C#), this pattern is in fact its constructor method and associated class type definition, which will define a fixed set of properties and methods at compilation time (this is not the case e.g. in JavaScript or weak-typed script languages, in which you can add methods and properties at runtime).

The Factory pattern can also be used to create interface instances. Main benefit is that alternative implementations may be easily interchanged. Such abstraction helps testing but also introduces interface-based services.

The Repository pattern is used to save and dispense each Aggregate Root.
It matches the "Layer Supertype" pattern (see above), e.g. via our mORMot TSQLRecord and TSQLRest classes and their Client-Server ORM features, or via dedicated repository classes - saving data is indeed a concern orthogonal to the model itself. DDD architects claim that persistence is infrastructure, not domain. You may benefit from defining your own repository interface, if the standard ORM / CRUD operations are not enough.

DTO and Events

In addition to these domain-level objects, some cross-cutting types may appear, especially at Application layer and Presentation layer:

  • Data Transfer Objects (DTO) are transmission objects, whose purpose is not to send your domain across the wire (i.e. to separate your layers, following the Anti-Corruption Layer pattern). They encourage you to create gatekeepers that work to prevent non-domain concepts from leaking into your model.
  • Commands and Events are some kind of DTO, since they communicate data about an event and they themselves encapsulate no behavior - in mORMot, we try to let the framework do all the plumbing, letting those types be implemented via interfaces, avoiding the need to define them by hand.

Those kinds of objects are needed to isolate the domain from the outer world. But if your domain is properly defined, most of your Value Objects may be used with no translation, so they could be used as DTO classes. Even Entities may be transmitted directly, since their methods should refer to nothing but their internal properties, so they may be of some usefulness outside the domain itself.
Only the Aggregates should better be isolated and stay at the Application layer, giving access to their methods and nested objects via proper high-level remote Services.

Services

Aggregate roots (and sometimes Entities), with all their methods, often end up as state machines, and the behavior matches accordingly.
In the domain, since Aggregate roots are the only kind of entities to which your software may hold a reference, they tend to be the main access point of any process. It could be handy to publish their methods as stateless Services, isolated at Application layer level.

Domain services pattern is used to model primary operations.
Domain Services give you a tool for modeling processes that do not have an identity or life-cycle in your domain, that is, that are not linked to one aggregate root - perhaps none, or several. In this terminology, services are not tied to a particular person, place, or thing in your application, but tend to embody processes. They tend to be named after verbs or business activities that domain experts introduce into the so-called Ubiquitous Language. If you follow the interface segregation principle, your domain services should be exposed as dedicated client-oriented methods. Do not leak your domain! In DDD, you develop your Application layer services directly from the needs of your client applications, letting the Domain layer focus on the business logic.

Unit Of Work can be used to maintain a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.
In short, it implements a transactional process at the Domain level, and may be implemented either at service or ORM level. It features so-called Persistence Ignorance, meaning that your domain code is not tied to a particular persistence implementation, but "hydrates" Aggregate Root class instances as abstractly as possible.

Clean Uncoupled Architecture

If you follow properly the DDD patterns, your classic n-Tier architecture will evolve into a so-called Clean Architecture or Hexagonal architecture.

Even if physically, this kind of architecture may still look like a classic layered design (with presentation on the top, business logic in the middle and a database at the bottom - and in this case we speak of N-Layered Domain-Oriented Architecture), DDD tries to isolate the Domain Model from any dependency, including technical details.

As a consequence, the logical architecture of any DDD solution should appear as such:

That kind of architecture is not designed in layers any more, but more like an Onion.

At the core of the bulb - sorry, of the system, you have the Domain Model.
It implements all Value Objects and Entity Objects, including their state and behavior, and associated unit tests.

Around this core, you find Domain Services, which add some more behavior to the inner model.
Typically, you will find here abstract interfaces that provide persistence (Aggregates saving and retrieving, via the Repository pattern), let Domain objects' properties and methods be defined (via the Factory pattern), or give access to third-party services (for service composition in a SOA world, or e.g. to send a notification email).

Then Application Services will define the workflows of all end-user applications.
Even if the core Domain is to be as stable as possible, this outer layer is what will change most often, depending on the applications consuming the Domain Services. Typically, workflows will consist in re-hydrating some Aggregates via the Repository interface, then calling the Domain logic (via its objects' methods, or, for primary operations, with wider Domain Services), calling any external service, and validating ("committing", following Unit-Of-Work or transactional terms) the objects' modifications.

Out on the edges you see User Interface, Infrastructure (including e.g. database persistence), and Tests. This outer layer is separated from the other three internal layers, which are sometimes called Application Core.
This is where all technical particularities will be concentrated, e.g. where RDBMS / SQL / ORM mapping will be defined, or platform-specific code will reside. This is the right level to test your end-user workflows, e.g. using Behavior-Driven Development (abbreviated BDD), with the help of your Domain experts.

The premise of this Architecture is that it controls coupling. The main rule is that all coupling is toward the center: all code can depend on layers more central, but code cannot depend on layers further out from the core. This is clearly stated in the above diagram: just follow the arrows, and you will find out the coupling order. This architecture is unashamedly biased toward object-oriented programming, and it puts objects before all others.

This Clean Architecture relies heavily on the Dependency Inversion principle. It emphasizes the use of interfaces for behavior contracts, and it forces the externalization of infrastructure to dedicated implementation classes. The Application Core needs implementations of its core interfaces, and if those implementing classes reside at the edges of the application, we need some mechanism for injecting that code at runtime, so the application can do something useful. mORMot's Client-Server features provide all the needed plumbing to access, even remotely, e.g. persistence or any third-party services, in an abstract way.

With Clean Architecture, the database is not the center of your logic, nor the bottom of your physical design - it is external. Externalizing the database can be quite a challenge for some people used to thinking about applications as "database applications", especially for Delphi programmers with a RAD / TDataSet background. With Clean Architecture, there are no database applications. There are applications that might use a database as a storage service, but only through some external infrastructure code that implements an interface which makes sense to the application core. The domain could even be decoupled from any ORM pattern, if needed. Decoupling the application from the database, the file system, third-party services and all technical details lowers the cost of maintenance for the life of the application, and allows proper testing of the code, since all Domain interfaces can be mocked on purpose.

Let's continue with part 4, which will define Domain-Driven Design as could be implemented with our Synopse mORMot framework.

Domain-Driven Design: part 4


One year ago, we already made a quick presentation of Domain-Driven Design, in the context of our mORMot framework.
After one year of real-world application of those patterns, it is now time to give more light to DDD.

Let's continue with part 4, which will define Domain-Driven Design as it could be implemented with our Synopse mORMot framework.

Designer's commitments

Before going a bit deeper into the low-level stuff, here are some key sentences we should better often refer to:

  1. I shall collaborate with domain experts;
  2. I shall focus on the ubiquitous language;
  3. I shall not care about technical stuff or framework, but about modeling the Domain;
  4. I shall make the implicit explicit;
  5. I shall use end-user scenarios to get real and concrete;
  6. I shall not be afraid of defining one model per context;
  7. I shall focus on my Core Domain;
  8. I shall let my Domain code uncoupled to any external influence;
  9. I shall separate values and time in state;
  10. I shall reduce statefulness to the only necessary;
  11. I shall always adapt my model as soon as possible, once it appears inadequate.

As a consequence, you will find in mORMot no magic powder to build your DDD, but all the tools you need to focus on your business, without losing time in re-inventing the wheel or fixing technical details.

Defining objects in Delphi

How to implement all those DDD concepts in an object-oriented language like Delphi?
Let's go back to the basics. Objects are defined by a state, a behavior and an identity. A factory helps creating objects with the same state and behavior.

In Delphi and most Object-Oriented languages (OOP - including C# or Java) each class instance (always inheriting from TObject):

  1. State is defined by all its property / member values;
  2. Behavior are defined by all its methods;
  3. Identity is defined by reference, i.e. a=b is true only if a and b refer to the same object;
  4. Factory is in fact the class type definition itself, which will force each instance to have the same members and methods.

In Delphi, the record type (and deprecated object type for older versions of the compiler) has an alternative behavior:

  1. State is also defined by all its property / member values;
  2. Behavior are also defined by all its methods;
  3. But identity is defined by content, i.e. RecordEquals(a,b) is true only if a and b have the same exact property values;
  4. Factory is in fact the record / object type definition itself, which will force each instance to have the same members and methods.

We propose to use either one of the two kinds of object types, depending on the behavior expected by DDD patterns.

Defining DDD objects in mORMot

DDD's Value Objects are probably meant to be defined as record, with methods (i.e. in this case as object, for older versions of Delphi). You may also use TComponent or TSQLRecord classes, ensuring the published properties do not have setters but just a read F... definition, to make them read-only and, at the same time, directly serializable.
If you use record / object types, you may need to customize the JSON serialization when targeting AJAX clients (by default, records are serialized as binary + Base64 encoding, but you can easily define the record serialization, e.g. from text). Note that since record / object define by-value types in Delphi (whereas class defines by-reference types - see the previous paragraph), they are probably the cleanest way of defining Value Objects.

DDD's Entity objects could be either regular Delphi classes, or inherit from TSQLRecord:

  • Using PODOs (Plain Old Delphi Objects - see the so-called POJO or POCO for Java or C#) has some advantages. Since your domain has to be uncoupled from the rest of your code, using plain classes helps keep your code clean and maintainable.
  • Inheriting from TSQLRecord will give it access to a whole set of methods supplied by mORMot. It will implement the "Layer Supertype" pattern, as explained by Martin Fowler.

DDD's Aggregates may either benefit from using mORMot's ORM, or you can use a dedicated repository service.

  • In the first case, your aggregate roots will be defined as TSQLRecord, and you will benefit of all CRUD methods made available by the framework;
  • Otherwise, you should define a dedicated persistence service, then use plain DTOs (like Delphi records) or even publish the TSQLRecord types, and benefit from their automated serialization.

In all cases, when defining domain objects, we should always make the implicit explicit, i.e. defining one type (either record/object or class) per reality in the model.
Thanks to Delphi's strong typing, it will ensure that the Domain Ubiquitous language will appear in the code.

DDD's DTOs may also be defined as record, and directly serialized as JSON via the text-based serialization. Don't be afraid of writing some translation layers between TSQLRecord and DTO records or, more generally, between your Application layer and your Presentation layer. It will be very fast, on the server side. If it makes your service interfaces cleaner, do not hesitate. But if it tends to force you to write a lot of wrapping code, forget about it, and expose your Value Objects or even your Entities, as stated above. Or automate the wrapper coding, using RTTI and code generators. You have to weigh the PROs and the CONs, as always...
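Such a translation layer could be as simple as the following sketch - TPersonDTO and TSQLRecordPerson being illustrative types, not defined by the framework:

type
  TPersonDTO = record
    Name: RawUTF8;
    BirthDate: TDateTime;
  end;

function PersonToDTO(aPerson: TSQLRecordPerson): TPersonDTO;
begin // flatten the ORM entity into a value, ready to be serialized as JSON
  result.Name := aPerson.Name;
  result.BirthDate := aPerson.BirthDate;
end;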

DDD's Events should be defined also as record, just like regular DTOs. Note that in the close future, it is planned that mORMot will allow such events to be defined as interface, in a KISS implementation.

mORMot's BATCH support is a convenient implementation of the Unit of Work pattern (i.e. regrouping all update / delete / insert operations in a single stream, with global Commit and Rollback methods). Note that the current implementation of Batch* methods in mORMot, which focuses on Client side, should be enhanced to be more convenient and available on the server side, i.e. in the Application Layer.

Defining services

In practice, mORMot's Client-Server architecture may be used as such:

  • Services via methods can be used to publish methods corresponding to your aggregate roots defined as TSQLRecord.
    This will make it pretty RESTful compatible.
  • Services via interfaces can be used to publish all your processes.
    Dedicated factories can be used on both Client and Server side, to define your repositories and/or domain operations.

Both methods allow proper customization and, especially the second, offer an integrated and automated process, e.g. RESTful access, JSON marshalling, sessions, security, logging and multi-threading.
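
For instance, here is a hedged sketch of how a repository or domain service may be published as an interface - ICustomerService / TCustomerService are hypothetical names, and the GUID is just a placeholder to be generated in your own code:

 type
   ICustomerService = interface(IInvokable)
     ['{9A60C8ED-CEB2-4E09-87D4-4A16F496E5FE}'] // any unique GUID
     function CustomerCount: integer;
   end;
 ...
  // server side: publish the implementation class as a shared instance
  Server.ServiceRegister(TCustomerService,[TypeInfo(ICustomerService)],sicShared);
  // client side: register the very same contract
  Client.ServiceRegister([TypeInfo(ICustomerService)],sicShared);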

Building a Clean architecture

A common DDD architecture is expressed as in the following model, which may look like a regular multi-Tier design at first, but should be implemented as a Clean Architecture.

Layer               Description
Presentation        MVC UI generation and reporting
Application         Services and high-level adapters
Domain Model        Where business logic remains
Data persistence    ORM and external services
Cross-Cutting       Horizontal aspects shared by other layers

Physically, it involves a common n-Tier representation splitting the classical Logic Tier into two layers, i.e. Application layer and Domain Model layer. At logical level, DDD will try to uncouple the Domain Model layer from other layers, so the code itself will rely on interfaces and dependency injection to let the core Domain focus on the business logic, not on implementation details (e.g. persistence or communication).

This is what we called a Clean Architecture, defined as such:

Clean Uncoupled Domain-Oriented Architecture

The RESTful SOA components of our Synopse mORMot framework can therefore define such an Architecture:

Clean Domain-Oriented Architecture of mORMot

As we already stated, the main point of this Clean Architecture is to control coupling, and to isolate the Domain core from the outer layers. In Delphi, unit dependencies (as displayed e.g. by our SynProject tool) will be a good testimony of proper object uncoupling: in the units defining your domain, you may split it between Domain Model and Domain Services (the second using the first, and not vice-versa), and you should never have any dependency on a particular DB unit, just on the framework's core units, i.e. SynCommons.pas and mORMot.pas. Inversion of Control - via interface-based services or at ORM initialization level - will ensure that your code is uncoupled from any low-level technical dependency. It will also allow proper testing of your application workflows, e.g. stubbing the database if necessary.

In fact, since SOA tends to ensure that services comprise unassociated, loosely coupled units of functionality that have no calls to each other embedded in them, we may define two levels of services, implemented by two interface factories, using their own hosting and communication:

  • One set of services at Application layer, to define the uncoupled contracts available from Client applications;
  • One set of services at the Domain Model layer, which will allow all involved domains to communicate with each other, without exposing them to the remote clients.

In order to provide better scaling on the server side, caching can easily be implemented at every level, and hosting can be tuned to provide the best possible response time: one central server, or several dedicated servers for the application, domain and persistence layers...

Due to the SOLID design of mORMot you can use as many Client-Server services layers as needed in the same architecture (i.e. a Server can be a Client of other processes), in order to fit your project needs, and let it evolve from the simplest architecture to a full scalable Domain-Driven design.


AES encryption over HTTP


In addition to regular HTTPS flow encryption, which is not easy to set up due to the needed certificates, mORMot proposes a proprietary encryption scheme. It is based on the SHA-256 and AES-256/CTR algorithms, which are known to be secure.

You do not need to setup anything on the server or the client configuration, just run the TSQLHttpClient and TSQLHttpServer classes with the corresponding parameters.

Note that this encryption uses a global key for the whole process, which should match on both the Server and Client sides. You had better hard-code this key in your Client and Server Delphi applications, with some variants depending on each end-user service. You can use CompressShaAesSetKey(), as defined in SynCrypto.pas, to set this Encryption Key globally, together with an optional Initialization Vector. You can even customize the AES chaining mode, if the default TAESCTR mode is not what you expect.
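
For instance, the shared key may be set once at application startup, on both sides - this is only a minimal sketch, so check the exact CompressShaAesSetKey() parameters in your SynCrypto.pas revision:

 uses SynCrypto;
 ...
  // hard-coded secret, shared by the Client and Server executables
  CompressShaAesSetKey('3ad5cfd7-c211-49fe-85ae-4b6401373ba1');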

When the aHttpServerSecurity parameter is set to secSynShaAes for the TSQLHttpServer.Create() constructor, this proprietary encryption will be enabled on the server side. For instance:

 MyServer := TSQLHttpServer.Create('888',[DataBase],'+',useHttpApiRegisteringURI,32,secSynShaAes);

On the client side, you can just set the TSQLHttpClientGeneric.Compression property as expected:

 MyClient.Compression := [hcSynShaAes];

Once those parameters have been set, a new proprietary encoding will be defined in the HTTP headers:

 ACCEPT-ENCODING: synshaaes

Then all HTTP body content will be compressed via our SynLZ algorithm, and encoded using the very secure AES-CTR/256 encryption.

Since it is a proprietary algorithm, it will work only for Delphi clients. When accessed from a plain AJAX client, or from a Delphi application with TSQLHttpClientGeneric.Compression = [], there won't be any encryption at all, due to the way HTTP negotiates its content encoding.
For safety, you should therefore use it in conjunction with our per-URI Authentication.

Feedback is welcome on our forum, as usual!

Some enhancements to REST routing of interface-based services


We have just committed some enhancements to the interface-based services process.

TSQLRestRoutingREST will now recognize several URI schemes: for instance, the new root/Calculator/Add?n1=1&n2=2 alternative could be pretty convenient to consume from some REST clients.

Please find here a documentation update.

Transmission content

All data is transmitted as JSON arrays or objects, according to the requested URI.

We'll discuss how data is expected to be transmitted, at the application level.

Request format

As stated above, there are several available modes of routing, defined by a given class, inheriting from TSQLRestServerURIContext:

                TSQLRestRoutingREST                          TSQLRestRoutingJSON_RPC
Description     RESTful mode                                 JSON-RPC mode
Default         Yes                                          No
URI scheme      /Model/Interface.Method[/ClientDrivenID]     /Model/Interface
                or /Model/Interface/Method[/ClientDrivenID]
                + optional URI-encoded params
Body content    JSON array of parameters,                    {"method":"MethodName",
                or void if parameters were                    "params":[...]
                encoded at URI                                [,"id":ClientDrivenID]}
Security        RESTful authentication for each method       RESTful authentication for the
                or for the whole service (interface)         whole service (interface)
Speed           10% faster                                   10% slower

The routing to be used is defined globally in the TSQLRest.ServiceRouting property, and should of course match on both the client and server side. Note that you should never assign the abstract TSQLRestServerURIContext class to this property.
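
For instance, switching both sides to the JSON-RPC scheme may be written as such - Server and Client being your TSQLRestServer / TSQLRestClientURI instances:

 Server.ServiceRouting := TSQLRestRoutingJSON_RPC;
 Client.ServiceRouting := TSQLRestRoutingJSON_RPC; // must match the server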

REST mode

In the default TSQLRestRoutingREST mode, both service and operation (i.e. interface and method) are identified within the URI. And the message body is a standard JSON array of the supplied parameters (i.e. all const and var parameters).

Here is a typical request for ICalculator.Add:

 POST /root/Calculator.Add
(...)
[1,2]

Here we use a POST verb, but the framework also allows other methods like GET, if needed (e.g. from a regular browser). The pure Delphi client implementation will only use POST.

For a sicClientDriven mode service, the needed instance ID is appended to the URI:

 POST /root/ComplexNumber.Add/1234
(...)
[20,30]

Here, 1234 is the server-side instance ID, used to track the instance life-time in sicClientDriven mode. One benefit of transmitting this Client Session ID within the URI is that it will be more secure in our RESTful authentication scheme: each method (and even each client driven session ID) will be signed properly.

In this TSQLRestRoutingREST mode, the server is also able to retrieve the parameters from the URI, if the message body is left void. This is not used by the Delphi clients (since it would be more complex and therefore slower), but it can be handy for other kinds of clients, if needed:

 POST root/Calculator.Add?+%5B+1%2C2+%5D
GET root/Calculator.Add?+%5B+1%2C2+%5D

In the above line, +%5B+1%2C2+%5D will be decoded as [1,2] on the server side. In conjunction with the use of a GET verb, it may be more suitable for a remote AJAX connection.

As an alternative, you can encode and name the parameters at URI level, in a regular HTML fashion:

 GET root/Calculator.Add?n1=1&n2=2

Since parameters are named, they can be supplied in any order. And if any parameter is missing, it will be replaced by its default value (e.g. 0 for a number or '' for a string).

This may be pretty convenient for simple services, consumed by any kind of client.

Note that there is a known size limitation when passing data within the URI over HTTP. The official RFC 2616 standard advises to limit the URI size to 255 characters, whereas in practice it sounds safe to transmit up to 2048 characters within the URI. If you want to get rid of this limitation, just use the default transmission of a JSON array as the request body.

As an alternative, the URI can be written as /RootName/InterfaceName/MethodName. It may be more RESTful-compliant, depending on your client policies. The following URIs will therefore be equivalent to the previous requests:

 POST /root/Calculator/Add
POST /root/ComplexNumber/Add/1234
POST root/Calculator/Add?+%5B+1%2C2+%5D
GET root/Calculator/Add?+%5B+1%2C2+%5D
GET root/Calculator/Add?n1=1&n2=2

From a Delphi client, the /RootName/InterfaceName.MethodName scheme will always be used.

JSON-RPC

If TSQLRestRoutingJSON_RPC mode is used, the URI will define the interface, and then the method name will be inlined with parameters, e.g.

 POST /root/Calculator
(...)
{"method":"Add","params":[1,2],"id":0}

Here, the "id" field may be left as 0, or even omitted, since it has no purpose in sicShared mode.

For a sicClientDriven mode service:

 POST /root/ComplexNumber
(...)
{"method":"Add","params":[20,30],"id":1234}

This mode will be a little bit slower, but will probably be more AJAX ready.

It's up to you to select the right routing scheme to be used.

Feedback is welcome in our forum, as usual!

REpresentational State Transfer (REST)

Representational state transfer (REST) is a style of software architecture for distributed hypermedia systems such as the World Wide Web.
As such, it is not just a method for building "web services". The terms "representational state transfer" and "REST" were introduced in 2000 in the doctoral dissertation of Roy Fielding, one of the principal authors of the Hypertext Transfer Protocol (HTTP) specification, on which the whole Internet relies.


There are five basic fundamentals of the web which are leveraged to create REST services:

  1. Everything is a Resource;
  2. Every Resource is Identified by a Unique Identifier;
  3. Use Simple and Uniform Interfaces;
  4. Communication is Done by Representation;
  5. Every Request is Stateless.

Resource-based

The Internet is all about getting data. This data can be in the format of a web page, an image, a video, a file, etc.
It can also be a dynamic output, like a list of newly subscribed customers.
The first important point in REST is to start thinking in terms of resources rather than physical files.

You access the resources via some URI, e.g.

  • http://www.mysite.com/pictures/logo.png - Image Resource;
  • http://www.mysite.com/index.html - Static Resource;
  • http://www.mysite.com/Customer/1001 - Dynamic Resource returning XML or JSON content;
  • http://www.mysite.com/Customer/1001/Picture - Dynamic Resource returning an image.

Unique Identifier

Older web techniques, e.g. aspx or ColdFusion, requested a resource by specifying parameters, e.g.

 http://www.mysite.com/Default.aspx?a=1;a=2&b=1&a=3

In REST, we add one more constraint to the current URI: in fact, every URI should uniquely represent every item of the data collection.

For instance, you can see the below unique URI format for customer and orders fetched:

Customer data                                 URI
Get Customer details with name "dupont"       http://www.mysite.com/Customer/dupont
Get Customer details with name "smith"        http://www.mysite.com/Customer/smith
Get orders placed by customer "dupont"        http://www.mysite.com/Customer/dupont/Orders
Get orders placed by customer "smith"         http://www.mysite.com/Customer/smith/Orders

Here, "dupont" and "smith" are used as unique identifiers to specify a customer.
In practice, a name is far from unique, therefore most systems use a unique ID (like an integer, a hexadecimal number or a GUID).

Interfaces

To access those identified resources, basic CRUD activity is identified by a set of HTTP verbs:

HTTP method   Action
GET           List the members of the collection (one or several)
PUT           Update a member of the collection
POST          Create a new entry in the collection
DELETE        Delete a member of the collection

Then, at URI level, you can define the type of collection, e.g. http://www.mysite.com/Customer to identify the customers or http://www.mysite.com/Customer/1234/Orders to access a given order.

This combination of HTTP method and URI replaces a list of English-based methods, like GetCustomer / InsertCustomer / UpdateOrder / RemoveOrder.

By Representation

What you are sending over the wire is in fact a representation of the actual resource data.

The main representation schemes are XML and JSON.

For instance, here is how customer data could be retrieved by a GET method:

<Customer>
  <ID>1234</ID>
  <Name>Dupond</Name>
  <Address>Tree street</Address>
</Customer>

Below is a simple JSON snippet for creating a new customer record with name and address:

 {"Customer": {"Name":"Dupont", "Address":"Tree street"}}

In response to this data transmitted with a POST command, the RESTful server will return the just-created ID.

The clarity of this format is one of the reasons why, in mORMot, we prefer to use the JSON format instead of XML or any proprietary format.

Stateless

Every request should be an independent request, so that we can scale up using load-balancing techniques.

An independent request means that the client sends the state of the request together with its data, so that the server can carry the process forward from that level to the next.

RESTful mORMot


Our Synopse mORMot Framework was designed in accordance with Fielding's REST architectural style without using HTTP and without interacting with the World Wide Web.
Systems which follow REST principles are often referred to as "RESTful".

Optionally, the Framework is able to serve standard HTTP/1.1 pages over the Internet (by using the mORMotHttpClient / mORMotHttpServer units and the TSQLHttpServer and TSQLHttpClient classes), via an embedded, fast and low-resource HTTP server.

The standard RESTful methods are implemented, i.e. GET/PUT/POST/DELETE.

The following methods were added to the standard REST definition, for locking individual records and for handling database transactions (which speed up the database process):

  • LOCK to lock a member of the collection;
  • UNLOCK to unlock a member of the collection;
  • BEGIN to initiate a transaction;
  • END to commit a transaction;
  • ABORT to rollback a transaction.

The GET method has an optional pagination feature, compatible with the YUI DataSource Request Syntax for data pagination - see the TSQLRestServer.URI method and http://developer.yahoo.com/yui/datatable/#data . Of course, this breaks the "Every Resource is Identified by a Unique Identifier" RESTful principle - but it is much easier to work with, e.g. to implement paging or custom filtering.
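
For instance, a paged request may look like the following sketch - the startIndex / results parameters follow the YUI syntax, and select / where are framework extensions, so double-check the expected parameter names against the TSQLRestServer.URI documentation of your revision:

 GET root/SampleRecord?select=*&where=Name='AB'&startIndex=0&results=20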

From the Delphi code point of view, a RESTful Client-Server architecture is implemented by inheriting some common methods and properties from a main class.

Then a full set of classes inherits from this TSQLRest abstract parent, e.g. TSQLRestClient, TSQLRestClientURI and TSQLRestServer.
This TSQLRest class therefore implements a common ancestor for both Client and Server classes.

BLOB fields

BLOB fields are defined as TSQLRawBlob published properties in the classes definition - which is an alias to the RawByteString type (defined in SynCommons.pas for Delphi up to 2007, since it appeared only with Delphi 2009). But their content is not included in standard RESTful methods of the framework, to spare network bandwidth.

The RESTful protocol allows BLOB to be retrieved (GET) or saved (PUT) via a specific URL, like:

 ModelRoot/TableName/TableID/BlobFieldName

This is even better than the standard JSON encoding, which works well but converts BLOBs to/from hexadecimal values, therefore needing twice their normal size. By using such a dedicated URL, data can be transferred as raw binary.

Some dedicated methods of the generic TSQLRest class handle BLOB fields: RetrieveBlob and UpdateBlob.
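
Here is a brief sketch of their use - it assumes a hypothetical TSQLSampleRecord class with a published Picture: TSQLRawBlob property, and an already connected Client: TSQLRestClientURI instance:

 var blob: TSQLRawBlob;
 begin
   // GET the raw binary, with no Base64/hexadecimal overhead
   if Client.RetrieveBlob(TSQLSampleRecord,1,'Picture',blob) then
     FileFromString(blob,'picture.jpg');
   // PUT some new binary content back to the server
   blob := StringFromFile('newpicture.jpg');
   Client.UpdateBlob(TSQLSampleRecord,1,'Picture',blob);
 end;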

JSON representation

The "04 - HTTP Client-Server" sample application available in the framework source code tree can be used to show how the framework is AJAX-ready, and can be proudly compared to any other REST server (like CouchDB) also based on JSON.

First, deactivate the authentication by changing the parameter from true to false in Unit2.pas:

 DB := TSQLRestServerDB.Create(Model,ChangeFileExt(paramstr(0),'.db3'),
 false);

and by commenting the following line in Project04Client.dpr:

  Form1.Database := TSQLHttpClient.Create(Server,'8080',Form1.Model);
  // TSQLHttpClient(Form1.Database).SetUser('User','synopse');
  Application.Run;

Then you can use your browser to test the JSON content:

  • Start the Project04Server.exe program: the background HTTP server, together with its SQLite3 database engine;
  • Start any Project04Client.exe instances, and add/find any entry, to populate the database a little;
  • Close the Project04Client.exe programs, if you want;
  • Open your browser, and type into the address bar:
      http://localhost:8080/root
    
  • You'll see an error message:
    TSQLHttpServer Server Error 400
    
  • Type into the address bar:
      http://localhost:8080/root/SampleRecord
    
  • You'll see the result of all SampleRecord IDs, encoded as a JSON list, e.g.
     [{"ID":1},{"ID":2},{"ID":3},{"ID":4}]
    
  • Type into the address bar:
      http://localhost:8080/root/SampleRecord/1
    
  • You'll see the content of the SampleRecord of ID=1, encoded as JSON, e.g.
    {"ID":1,"Time":"2010-02-08T11:07:09","Name":"AB","Question":"To be or not to be"}
    
  • Type into the address bar any other REST command, and the database will reply to your request...

You have got a full HTTP/SQLite3 RESTful JSON server in less than 400 KB. :)

Note that Internet Explorer or old versions of Firefox do not recognize the application/json; charset=UTF-8 content type for internal viewing. This is a limitation of those browsers: the above requests will download the content as .json files, but this won't prevent AJAX requests from working as expected.

Stateless ORM

Our framework is implementing REST as a stateless protocol, just as the HTTP/1.1 protocol it could use as its communication layer.

A stateless server is a server that treats each request as an independent transaction that is unrelated to any previous request.

At first, you could find it a bit disappointing, coming from a classic Client-Server approach. In a stateless world, you are never sure that your Client data is up-to-date. The only place where the data is safe is the server. In the web world, this is nothing unusual. But if you are coming from a rich Client background, this may concern you: you probably have the habit of writing synchronization code to replicate all server changes to all its clients. This is not necessary in a stateless architecture any more.

The main rule of this architecture is to ensure that the Server is the only reference, and that the Client is able to retrieve any pending update from the Server side. That is, always modify a record content on a server side, then refresh the client to retrieve the modified value. Do not modify the client side directly, but always pass through the Server. The UI components of the framework follow these principles. Client-side modification could be performed, but must be made in a separated autonomous table/database. This will avoid any synchronization problem in case of concurrent client modification.
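
As a minimal sketch of this rule - Customer being a TSQLRecord instance previously retrieved from the server, and Client a connected TSQLRestClientURI:

 Customer.Name := 'New name';
 Client.Update(Customer);                // modify the reference data, server-side
 Client.Retrieve(Customer.ID,Customer);  // then refresh the client-side copy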

A stateless design is also pretty convenient when working with complex solutions.
Even Domain-Driven Design tends to restrain state to its smallest possible extent, since state introduces complexity.

TDocVariant custom variant type


With revision 1.18 of the framework, we just introduced two new custom types of variants:

  • TDocVariant kind of variant;
  • TBSONVariant kind of variant.

The second custom type (which handles MongoDB-specific extensions - like ObjectID or other specific types like dates or binary) will be presented later, when dealing with MongoDB support in mORMot, together with the BSON kind of content. BSON / MongoDB support is implemented in the SynMongoDB.pas unit.

We will now focus on TDocVariant itself, which is a generic container of JSON-like objects or arrays.
This custom variant type is implemented in SynCommons.pas unit, so is ready to be used everywhere in your code, even without any link to the mORMot ORM kernel, or MongoDB.

TDocVariant documents

TDocVariant implements a custom variant type which can be used to store any JSON/BSON document-based content, i.e. either:

  • Name/value pairs, for object-oriented documents;
  • An array of values (including nested documents), for array-oriented documents;
  • Any combination of the two, by nesting TDocVariant instances.

Here are the main features of this custom variant type:

  • DOM approach of any object or array documents;
  • Perfect storage for dynamic value-objects content, with a schema-less approach (as you may be used to in scripting languages like Python or JavaScript);
  • Allow nested documents, with no depth limitation but the available memory;
  • Assignment can be either per-value (default, safest but slower when containing a lot of nested data), or per-reference (immediate reference-counted assignment);
  • Very fast JSON serialization / un-serialization with support of MongoDB-like extended syntax;
  • Access to properties in code, via late-binding (including almost no speed penalty due to our VCL hack as already detailed);
  • Direct access to the internal variant names and values arrays from code, by trans-typing into a TDocVariantData record;
  • Instance life-time is managed by the compiler (like any other variant type), without the need to use interfaces or explicit try..finally blocks;
  • Optimized to use as little memory and CPU resource as possible (in contrast to most other libraries, it does not allocate one class instance per node, but rely on pre-allocated arrays);
  • Opened to extension of any content storage - for instance, it will perfectly integrate with BSON serialization and custom MongoDB types (ObjectID, RegEx...), to be used in conjunction with MongoDB servers;
  • Perfectly integrated with our Dynamic array wrapper and its JSON serialization as with the record serialization;
  • Designed to work with our mORMot ORM: any TSQLRecord instance containing such variant custom types as published properties will be recognized by the ORM core, and work as expected with any database back-end (storing the content as JSON in a TEXT column);
  • Designed to work with our mORMot SOA: any interface-based service is able to consume or publish such kind of content, as variant kind of parameters;
  • Fully integrated with the Delphi IDE: any variant instance will be displayed as JSON in the IDE debugger, making it very convenient to work with.

To create instances of such variant, you can use some easy-to-remember functions:

  • _Obj() _ObjFast() global functions to create a variant object document;
  • _Arr() _ArrFast() global functions to create a variant array document;
  • _Json() _JsonFast() _JsonFmt() _JsonFastFmt() global functions to create any variant object or array document from JSON, supplied either with standard or MongoDB-extended syntax.

Variant object documents

With _Obj(), an object variant instance will be initialized with data supplied two by two, as Name,Value pairs, e.g.

var V1,V2: variant; // stored as any variant
 ...
  V1 := _Obj(['name','John','year',1972]);
  V2 := _Obj(['name','John','doc',_Obj(['one',1,'two',2.5])]); // with nested objects

Then you can convert those objects into JSON, by two means:

  • Using the VariantSaveJson() function, which return directly one UTF-8 content;
  • Or by trans-typing the variant instance into a string (this will be slower, but is possible).
 writeln(VariantSaveJson(V1)); // explicit conversion into RawUTF8
 writeln(V1);                  // implicit conversion from variant into string
 // both commands will write '{"name":"John","year":1972}'
 writeln(VariantSaveJson(V2)); // explicit conversion into RawUTF8
 writeln(V2);                  // implicit conversion from variant into string
 // both commands will write '{"name":"John","doc":{"one":1,"two":2.5}}'

As a consequence, the Delphi IDE debugger is able to display such variant values as their JSON representation.
That is, V1 will be displayed as '{"name":"John","year":1972}' in the IDE debugger Watch List window, or in the Evaluate/Modify (F7) expression tool.
This is pretty convenient, and much more user friendly than any class-based solution (which requires the installation of a specific design-time package in the IDE).

You can access to the object properties via late-binding, with any depth of nesting objects, in your code:

 writeln('name=',V1.name,' year=',V1.year);
 // will write 'name=John year=1972'
 writeln('name=',V2.name,' doc.one=',V2.doc.one,' doc.two=',V2.doc.two);
 // will write 'name=John doc.one=1 doc.two=2.5'
 V1.name := 'Mark';       // overwrite a property value
 writeln(V1.name);        // will write 'Mark'
 V1.age := 12;            // add a property to the object
 writeln(V1.age);         // will write '12'

Note that the property names will be evaluated at runtime only, not at compile time.
For instance, if you write V1.nome instead of V1.name, there will be no error at compilation, but an EDocVariant exception will be raised at execution (unless you set the dvoReturnNullForUnknownProperty option to _Obj/_Arr/_Json/_JsonFmt which will return a null variant for such undefined properties).
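
Here is a short sketch of this option - assuming the standard Variants unit is available for VarIsNull():

 var V: variant;
 begin
   V := _Json('{"name":"john"}',[dvoReturnNullForUnknownProperty]);
   if VarIsNull(V.nome) then
     writeln('no "nome" property - and no EDocVariant raised');
 end;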

In addition to the property names, some pseudo-methods are available for such objectvariant instances:

  writeln(V1._Count); // will write 3 i.e. the number of name/value pairs in the object document
  writeln(V1._Kind);  // will write 1 i.e. ord(sdkObject)
  for i := 0 to V2._Count-1 do
    writeln(V2.Name(i),'=',V2.Value(i));
  // will write in the console:
  //  name=John
  //  doc={"one":1,"two":2.5}
  //  age=12
  if V1.Exists('year') then
    writeln(V1.year);

You may also trans-type your variant instance into a TDocVariantData record, and access directly to its internals.
For instance:

 TDocVariantData(V1).AddValue('comment','Nice guy');
 with TDocVariantData(V1) do               // direct trans-typing
   if Kind=sdkObject then                  // direct access to the Kind: TDocVariantDataKind field
     for i := 0 to Count-1 do              // direct access to the Count: integer field
       writeln(Names[i],'=',Values[i]);    // direct access to the internal storage arrays

By definition, trans-typing via a TDocVariantData record is slightly faster than using late-binding.
But you must ensure that the variant instance is really a TDocVariant kind of data before transtyping e.g. by calling DocVariantType.IsOfType(aVariant).
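
For instance, such a guard may be written as follows:

 if DocVariantType.IsOfType(V1) then
   with TDocVariantData(V1) do // safe trans-typing
     writeln('this document holds ',Count,' name/value pairs');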

Variant array documents

With _Arr(), an array variant instance will be initialized with data supplied as a list of Value1,Value2,..., e.g.

var V1,V2: variant; // stored as any variant
 ...
  V1 := _Arr(['John','Mark','Luke']);
  V2 := _Obj(['name','John','array',_Arr(['one','two',2.5])]); // as nested array

Then you can convert those objects into JSON, by two means:

  • Using the VariantSaveJson() function, which return directly one UTF-8 content;
  • Or by trans-typing the variant instance into a string (this will be slower, but is possible).
 writeln(VariantSaveJson(V1));
 writeln(V1);  // implicit conversion from variant into string
 // both commands will write '["John","Mark","Luke"]'
 writeln(VariantSaveJson(V2));
 writeln(V2);  // implicit conversion from variant into string
 // both commands will write '{"name":"John","array":["one","two",2.5]}'

As with any object document, the Delphi IDE debugger is able to display such array variant values as their JSON representation.

Late-binding is also available, with a special set of pseudo-methods:

  writeln(V1._Count); // will write 3 i.e. the number of items in the array document
  writeln(V1._Kind);  // will write 2 i.e. ord(sdkArray)
  for i := 0 to V1._Count-1 do
    writeln(V1.Value(i),':',V1._(i));
  // will write in the console:
  //  John John
  //  Mark Mark
  //  Luke Luke
  if V1.Exists('John') then
    writeln('John found in array');

Of course, trans-typing into a TDocVariantData record is possible, and will be slightly faster than using late-binding.

Create variant object or array documents from JSON

With _Json() or _JsonFmt(), either an object or an array variant instance will be initialized with data supplied as JSON, e.g.

var V1,V2,V3,V4: variant; // stored as any variant
 ...
  V1 := _Json('{"name":"john","year":1982}'); // strict JSON syntax
  V2 := _Json('{name:"john",year:1982}');     // with MongoDB extended syntax for names
  V3 := _Json('{"name":?,"year":?}',[],['john',1982]);
  V4 := _JsonFmt('{%:?,%:?}',['name','year'],['john',1982]);
  writeln(VariantSaveJSON(V1));
  writeln(VariantSaveJSON(V2));
  writeln(VariantSaveJSON(V3));
  writeln(VariantSaveJSON(V4));
  // all four commands will write '{"name":"john","year":1982}'

Of course, you can nest objects or arrays as parameters to the _JsonFmt() function.

The supplied JSON can be either in strict JSON syntax, or with the MongoDB extended syntax, i.e. with unquoted property names.
When typing JSON in your Delphi code, it can be pretty convenient, and also less error-prone, to forget about the quotes around the property names.

Note that TDocVariant implements an open interface for adding any custom extensions to JSON: for instance, if the SynMongoDB.pas unit is defined in your application, you will be able to create any MongoDB specific types in your JSON, like ObjectID(), new Date() or even /regex/option.

As with any object or array document, the Delphi IDE debugger is able to display such variant values as their JSON representation.

Per-value or per-reference

By default, the variant instance created by _Obj() _Arr() _Json() _JsonFmt() will use a copy-by-value pattern.
It means that when an instance is assigned to another variable, a new variant document will be created, and all internal values will be copied. Just like a record type.

This will imply that if you modify any item of the copied variable, it won't change the original variable:

var V1,V2: variant;
 ...
 V1 := _Obj(['name','John','year',1972]);
 V2 := V1;                // create a new variant, and copy all values
 V2.name := 'James';      // modifies V2.name, but not V1.name
 writeln(V1.name,' and ',V2.name);
 // will write 'John and James'

As a result, your code will be perfectly safe to work with, since V1 and V2 will be uncoupled.

But one drawback is that passing such a value may be pretty slow, for instance, when you nest objects:

var V1,V2: variant;
 ...
 V1 := _Obj(['name','John','year',1972]);
 V2 := _Arr(['John','Mark','Luke']);
 V1.names := V2; // here the whole V2 array will be re-allocated into V1.names

Such a behavior could be pretty time and resource consuming, in case of a huge document.

All the _Obj() _Arr() _Json() _JsonFmt() functions have an optional TDocVariantOptions parameter, which allows changing the behavior of the created TDocVariant instance, especially setting dvoValueCopiedByReference.

This particular option will set the copy-by-reference pattern:

var V1,V2: variant;
 ...
 V1 := _Obj(['name','John','year',1972],[dvoValueCopiedByReference]);
 V2 := V1;             // creates a reference to the V1 instance
 V2.name := 'James';   // modifies V2.name, but also V1.name
 writeln(V1.name,' and ',V2.name);
 // will write 'James and James'

You may think this behavior is somewhat weird for a variant type. But if you forget about per-value objects, and consider those TDocVariant types as Delphi class instances (which are per-reference types), without the need for a fixed schema nor manual memory handling, it will probably start to make sense.

Note that a set of global functions have been defined, which allow direct creation of documents with per-reference instance lifetime, named _ObjFast() _ArrFast() _JsonFast() _JsonFastFmt().
Those are just wrappers around the corresponding _Obj() _Arr() _Json() _JsonFmt() functions, with the following JSON_OPTIONS[true] constant passed as the options parameter:

const
  /// some convenient TDocVariant options
  // - JSON_OPTIONS[false] is the default for the _Json() and _JsonFmt() functions
  // - JSON_OPTIONS[true] is used by the _JsonFast() and _JsonFastFmt() functions
  JSON_OPTIONS: array[Boolean] of TDocVariantOptions = (
    [dvoReturnNullForUnknownProperty],
    [dvoReturnNullForUnknownProperty,dvoValueCopiedByReference]);

When working with complex documents, e.g. with BSON / MongoDB documents, almost all content will be created in "fast" per-reference mode.

Advanced TDocVariant process

Object or array document creation options

As stated above, a TDocVariantOptions parameter enables to define the behavior of a TDocVariant custom type for a given instance.
Please refer to the documentation of this set of options to find out the available settings. Some are related to the memory model, other to case-sensitivity of the property names, other to the behavior expected in case of non-existing property, and so on...

Note that this setting is local to the given variant instance.

In fact, TDocVariant does not force you to stick to one memory model nor a set of global options, but you can use the best pattern depending on your exact process.
You can even mix the options - i.e. include some objects as properties in an object created with other options - but in this case, the initial options of the nested object will remain. So you had better use this feature with caution.

You can use the _Unique() global function to force a variant instance to have a unique set of options, with all nested documents becoming by-value, or the _UniqueFast() function to have all nested documents become by-reference.

// assuming V1='{"name":"James","year":1972}' created by-reference
  _Unique(V1);             // change options of V1 to be by-value
  V2 := V1;                // creates a full copy of the V1 instance
  V2.name := 'John';       // modifies V2.name, but not V1.name
  writeln(V1.name);        // write 'James'
  writeln(V2.name);        // write 'John'
  V1 := _Arr(['root',V2]); // created as by-value by default, as V2 was
  writeln(V1._Count);      // write 2
  _UniqueFast(V1);         // change options of V1 to be by-reference
  V2 := V1;
  V1._(1).name := 'Jim';
  writeln(V1);
  writeln(V2);
  // both commands will write '["root",{"name":"Jim","year":1972}]'

The easiest is to stick to one set of options in your code, i.e.:

  • Either using the _*() global functions if your business code does send some TDocVariant instances to any other part of your logic, for further storage: in this case, the by-value pattern does make sense;
  • Or using the _*Fast() global functions if the TDocVariant instances are local to a small part of your code, e.g. used as schema-less Data Transfer Objects (DTO).

In all cases, be aware that, like any class type, the const, var and out specifiers of method parameters do not apply to the TDocVariant value, but to its reference.

Integration with other mORMot units

In fact, whenever a schema-less storage structure is needed, you may use a TDocVariant instance instead of strongly-typed class or record types:

  • Client-Server ORM will support TDocVariant in any of the TSQLRecord variant published properties (see the sketch after this list);
  • Interface-based services will support TDocVariant as variant parameters of any method, which make them as perfect DTO;
  • Since JSON support is implemented with any TDocVariant value from the ground up, it makes a perfect fit for working with AJAX clients, in a script-like approach;
  • If you use our SynMongoDB.pas unit to access a MongoDB server, TDocVariant will be the native storage to create or access BSON arrays or objects documents;
  • Cross-cutting features (like logging or record / dynamic array enhancements) will also benefit from this TDocVariant custom type.
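
To illustrate the first point, here is a hedged sketch of a TSQLRecord class hosting a schema-less TDocVariant field, which the ORM will persist as JSON in a TEXT column - TSQLArticle and Meta are illustrative names only:

 type
   TSQLArticle = class(TSQLRecord)
   private
     FTitle: RawUTF8;
     FMeta: variant;
   published
     property Title: RawUTF8 read FTitle write FTitle;
     property Meta: variant read FMeta write FMeta;
   end;
 ...
  // fill the schema-less field with any nested document
  Article.Meta := _ObjFast(['tags',_ArrFast(['mORMot','JSON']),'views',0]);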

We are pretty convinced that once you start playing with TDocVariant, you won't be able to live without it any more.
It introduces the full power of late-binding and schema-less patterns to your application, which can be pretty useful for prototyping or in Agile development.
You do not need to use scripting engines like Python or JavaScript to have this feature, if you need it.

Feedback and comments are welcome in our forum, as usual!
