Dear list,
I am trying to give a small company access to their contact database via a web interface. Long story short, 'carddavmate' was the only web client I could convince to even show the address book (even though it's not parsing the 'n' attribute correctly, apparently because of too many trailing ';' characters in the field). But here's the problem: it's a tiny server and a rather large address book (~4000 contacts, 7.5 MB of XML data), and the initial load of the resource takes a long time - once processed it works fine. To load it at all I had to increase 'max_execution_time' in php.ini for DAViCal and 'timeOut' in carddavmate (both latest versions). But even then it fetches the complete address book, which takes a long time, and then jQuery parses it, which takes another 30 seconds or so and renders the whole browser unusable in the meantime (most of the time is apparently spent in the 'insertContact' function - I can't reproduce this now, but earlier I got 'vcardUnescapeValue' as the most time-consuming function, which makes more sense and seems to be the more accurate result). Every other client I tried simply timed out.
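For reference, these are the two settings I mean - the exact values below are only placeholders, not recommendations:

; php.ini on the DAViCal host - allow the REPORT request to run longer than the 30 s default
max_execution_time = 300

// config.js in CardDavMATE - the 'timeOut' value in the account settings is in milliseconds
timeOut: 90000,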
The server has a 1.2 GHz dual-core ARM processor and 2 GB of RAM. It takes approximately 40 s to prepare the data - during this period the 'apache -k start' process that serves the REPORT request consumes one full core on the server (no visible load from PostgreSQL..?) - then another 10 s to send it, and then another 30 s to parse it in the browser (see http://gorilla-computing.de/misc/carddavmate.png for a graphical impression). Is that to be expected? It's not that much data, after all (even though it is compressed and encrypted, and those aren't exactly the strengths of the ARM). Or could it somehow be corrupted data that makes the queries / the parsing take so long? It's a rather ugly Outlook import, and a lot of invalid stuff has accumulated in several fields (the note field in particular is a constant nightmare).
On the client side: could it be the unnecessary semicolons and the note fields that keep 'vcardUnescapeValue' so terribly busy? And if so, is there an easy way to change all contacts directly via psql? Or maybe first: are there any drawbacks to modifying faulty fields directly in the database?
And finally: I know caldavzap supports "interval synchronization" for querying only the relevant time frame from the server, and indeed calendar sync works fine even though the calendar database is even bigger. Is there anything like that for CardDAV queries? For example, only import the first hundred contacts and, when the user scrolls down or searches for a contact, parse the next fifty, and so on?
Thanks a lot and sorry for the essay!
Best, Paul
Hi Paul,
On 12 Jan 2015, at 17:50, Paul Kallnbach - Gorilla Computing p.kallnbach@gorilla-computing.de wrote:
Dear list,
I am trying to give a small company access to their contact database via a web interface. Long story short, 'carddavmate' was the only web client I could convince to even show the address book (even though it's not parsing the 'n' attribute correctly, apparently because of too many trailing ';' characters in the field).
This may be true; I expect valid data, because maintaining an "infinite list of workarounds" for invalid data consumes a ton of CPU/human resources.
But here's the problem: it's a tiny server and a rather large address book (~4000 contacts, 7.5 MB of XML data), and the initial load of the resource takes a long time - once processed it works fine. To load it at all I had to increase 'max_execution_time' in php.ini for DAViCal and 'timeOut' in carddavmate (both latest versions). But even then it fetches the complete address book, which takes a long time, and then jQuery parses it, which takes another 30 seconds or so and renders the whole browser unusable in the meantime (most of the time is apparently spent in the 'insertContact' function - I can't reproduce this now, but earlier I got 'vcardUnescapeValue' as the most time-consuming function, which makes more sense and seems to be the more accurate result). Every other client I tried simply timed out.
4000 contacts in one resource may be too much. We have 14,869 contacts divided into 87 collections, and nobody loads them all at once (our users use 1-10 collections; only these are loaded into the client).
The server has a 1.2 GHz dual-core ARM processor and 2 GB of RAM. It takes approximately 40 s to prepare the data - during this period the 'apache -k start' process that serves the REPORT request consumes one full core on the server (no visible load from PostgreSQL..?) - then another 10 s to send it, and then another 30 s to parse it in the browser (see http://gorilla-computing.de/misc/carddavmate.png for a graphical impression). Is that to be expected? It's not that much data, after all (even though it is compressed and encrypted, and those aren't exactly the strengths of the ARM).
Selecting CardDAV data from DAViCal is very fast, because the initial synchronization is essentially SELECT caldav_data FROM caldav_data WHERE ... - without any complicated joins or anything similar. Your problem is very probably the compression (encryption consumes much less CPU than compression) and/or wrong PHP/Apache settings (e.g. both PHP and Apache compression are enabled, the compression level is too high, the PHP memory settings are wrong, ...).
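To make that concrete, the kind of settings to check look roughly like this - only a sketch, the exact directives and values depend on your distribution and setup:

; php.ini - do not compress in PHP if Apache already compresses,
; and give PHP enough headroom for a 7.5 MB REPORT body
zlib.output_compression = Off
memory_limit = 256M

# Apache (mod_deflate) - a single, cheap compression layer is
# usually enough on a slow ARM CPU
SetOutputFilter DEFLATE
DeflateCompressionLevel 1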
Or could it somehow be corrupted data that makes the queries / the parsing take so long? It's a rather ugly Outlook import, and a lot of invalid stuff has accumulated in several fields (the note field in particular is a constant nightmare).
Corrupted data may (and very probably will) affect CardDavMATE performance but NOT DAViCal performance (there is no vCard data processing when it returns CardDAV data).
On the client side: could it be the unnecessary semicolons and the note fields that keep 'vcardUnescapeValue' so terribly busy? And if so, is there an easy way to change all contacts directly via psql? Or maybe first: are there any drawbacks to modifying faulty fields directly in the database?
Fixing invalid (non-RFC-compliant) data is a MUST; otherwise CardDavMATE will not show you the invalid entries. You can fix them directly in the database (UPDATE caldav_data SET caldav_data=E'BEGIN:VCARD\r\n...\r\nEND:VCARD\r\n' WHERE dav_name=...), but if you do this, clients that use the sync-collection REPORT will never notice the change (you would need to restart the sync in all existing clients). If restarting (re-syncing) all clients is not possible in your environment, then you need to perform the "synchronization" yourself by calling the following after each update to the caldav_data table:
SELECT write_sync_change((SELECT collection_id FROM collection WHERE dav_name=regexp_replace(dav_name, '[^/]+$', '')), 200, dav_name);
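For example, here is a sketch that removes surplus trailing semicolons from 'N:' lines and registers the sync change for every touched row in one statement. It assumes plain 'N:' lines without parameters, that the only damage is trailing ';' characters, that caldav_data carries a collection_id column, and PostgreSQL 9.1+ for the data-modifying CTE; please test it on a copy of the database first:

-- Trim everything after the 4th ';' on N: lines and record a sync
-- change (status 200) for each row that was actually modified.
WITH fixed AS (
    UPDATE caldav_data
       SET caldav_data = regexp_replace(caldav_data,
               E'(\\nN:[^;\\r\\n]*(;[^;\\r\\n]*){4});+(\\r?\\n)',
               E'\\1\\3', 'g')
     WHERE caldav_type = 'VCARD'
       AND caldav_data ~ E'\\nN:([^;\\r\\n]*;){5}'
    RETURNING collection_id, dav_name
)
SELECT write_sync_change(collection_id, 200, dav_name) FROM fixed;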
Also, please try the latest CardDavMATE 0.12rc - it contains a ton of performance improvements.
And finally: I know caldavzap supports "interval synchronization" for querying only the relevant time frame from the server, and indeed calendar sync works fine even though the calendar database is even bigger. Is there anything like that for CardDAV queries? For example, only import the first hundred contacts and, when the user scrolls down or searches for a contact, parse the next fifty, and so on?
Yes, this is possible in principle, because the client first downloads the list of contacts and then makes a request to download the contacts themselves. CardDavMATE downloads all contacts from the list in one request, because with only a list of URLs it is not possible to sort the data, so downloading 100 contacts would give you 100 random contacts => unusable. CardDAV also defines something like "search", but the problem is that most servers don't support it, or only a subset of the vCard data is searchable => unusable in a real environment.
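For what it is worth, the "download the contact itself" step is the addressbook-multiget REPORT from RFC 6352; the request body a client would send to fetch only a chosen subset looks roughly like this (the hrefs are made-up DAViCal-style paths):

<?xml version="1.0" encoding="utf-8"?>
<C:addressbook-multiget xmlns:D="DAV:"
                        xmlns:C="urn:ietf:params:xml:ns:carddav">
  <D:prop>
    <D:getetag/>
    <C:address-data/>
  </D:prop>
  <!-- only the hrefs the client actually wants right now -->
  <D:href>/caldav.php/company/addressbook/contact-0001.vcf</D:href>
  <D:href>/caldav.php/company/addressbook/contact-0002.vcf</D:href>
</C:addressbook-multiget>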
JM