Blobs or Filesystem for images

Fellow Servoyers

I have been asked to reproduce an existing FileMaker solution in Servoy to solve performance / stability / future-proofing / licensing cost issues. I have to put thumbnail images into Servoy from what could be large image files (probably all JPEGs). I have to decide whether to let Servoy manage the images in blobs, or let it manage them via the filesystem of the server.

Everything I can see suggests I should do this via the filesystem, and I am almost 100% convinced that that is the way I’ll go. JA, however, suggested that I should put the images into the database.

My question is: which is the better way, and why? And if it were to be blobs, which is the best database? If I choose to do it the way I’m inclined to, is there a way to get Servoy to manage the images in the filesystem of the server without needing to give Servoy clients access to the share?

Thanks for any help you can give or light you can shed…

I vote for the jaleman solution. I use it in a couple of solutions that are already in production (using Sybase iAnywhere) with no problems, and without any worries about shared folders, directory backups, filename limitations, etc.

Thanks Enrico

My reason for wanting to go the other way is that a dependency on the database to keep itself in check is something that I’ll never get away from. If something happens to the database, or if we end up with 40,000 x 5MB image files (and the database swells to 200GB) and THEN it corrupts or slows down, or something, I’d hate to be in the middle of it. If, however, the database stays lean and mean and the pictures are only referenced from inside it, the pictures can be incrementally backed up, the database can be backed up easily and quickly, and the whole thing is essentially infinitely expandable. My reason for being wary of blobs is that in most commercial software I see, storing files in the filesystem seems to avoid problems (iTunes, iPhoto and Eudora all store files in the filesystem and work fine; Entourage and Mail store attachments in a messages ‘file’, and I have seen awful problems with corruption that just don’t happen if the database lets the filesystem manage files). I think it would be easier to put them into blobs, but I am still torn over which is the BETTER way.
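For what it's worth, the "lean database that only references the pictures" idea can be sketched roughly like this. It is only an illustration under stated assumptions: Python and SQLite stand in for Servoy and the real database, and the table, column and function names (`images`, `add_image`, `load_image`) are made up for the example. The file is copied into a directory that only the server process needs to reach, the database row holds just the path and some metadata, and clients ask the server for the bytes rather than opening the share themselves.

```python
import shutil
import sqlite3
import uuid
from pathlib import Path


def init_db(conn: sqlite3.Connection) -> None:
    # The table holds a reference to the file, never the image bytes.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        " id INTEGER PRIMARY KEY,"
        " original_name TEXT NOT NULL,"
        " stored_path TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL)"
    )


def add_image(conn: sqlite3.Connection, source: Path, image_root: Path) -> int:
    """Copy the file into image_root and record only its location in the DB."""
    image_root.mkdir(parents=True, exist_ok=True)
    # Server-generated name, so user renames and odd filenames do not matter.
    stored = image_root / (uuid.uuid4().hex + source.suffix.lower())
    shutil.copy2(source, stored)
    cur = conn.execute(
        "INSERT INTO images (original_name, stored_path, size_bytes)"
        " VALUES (?, ?, ?)",
        (source.name, str(stored), stored.stat().st_size),
    )
    conn.commit()
    return cur.lastrowid


def load_image(conn: sqlite3.Connection, image_id: int) -> bytes:
    """Read the bytes on the server side; clients never touch image_root."""
    row = conn.execute(
        "SELECT stored_path FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        raise KeyError(image_id)
    return Path(row[0]).read_bytes()
```

The database stays small because each row is a few dozen bytes regardless of image size, and the image directory can be backed up separately from the database.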

Thunder:
If something happens to the database, or if we end up with 40,000 x 5MB image files (and the database swells to 200GB) and THEN it corrupts or slows down, or something, I’d hate to be in the middle of it.

So would I if something happened to my FILESYSTEM with 40,000 x 5MB image files in it, so…

I really feel more comfortable knowing that my images are managed by a good database engine and not just written on a shared file system exposed to stupid users / viruses / wrong commands / bad luck / renames / etc …

I completely agree

BUT…

A file system full of 200GB worth of images can be incrementally backed up (today we’ll back up the 15 images that were added today, tomorrow we’ll back up the 20 images that’ll be added tomorrow, etc.). How does one back up a 200GB database that is changing all the time? I know that Sybase can back itself up, but I think it would be tricky to do that to tape with 200GB per day.
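The "back up only what was added today" idea can be sketched as below. This is a minimal illustration, not a real backup tool: the function name `incremental_backup` and the use of file modification times to detect new images are assumptions made for the example, and a production setup would more likely use dedicated backup software.

```python
import shutil
from pathlib import Path
from typing import List


def incremental_backup(source: Path, dest: Path, last_backup_time: float) -> List[Path]:
    """Copy files under source that were modified after last_backup_time.

    last_backup_time is in seconds since the epoch (as returned by time.time()).
    Returns the list of files written to dest.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            target = dest / path.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves the modification time
            copied.append(target)
    return copied
```

Each run touches only the new or changed images, so the daily cost depends on the day's additions rather than on the 200GB total.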

Also, surely Servoy Server can manage access to the directory containing images without having to give folder access to users? And hacks / viruses etc. could equally affect a machine with Servoy Server running on it (although this will be running on a Mac server ;).

Still trying to convince myself that there is a benefit to using blobs..

I think you have to check out the backup options for the different DBs around. Maybe make your DB choice based on your requirements.

We work mainly on Oracle, and that supports incremental backups. I don’t know if it works exactly like you describe, but our 100+ GB DB also gets backed up properly. Other DB systems will probably have tackled this as well.

OK, if the whole DB crashed you would need quite some recovery time, but the same goes for a file system.

Just my 5 cents…

Paul

Sybase Anywhere (and I believe a lot of other serious DBs) has incremental backup.

It also has live backup (from the Anywhere manual: to provide a redundant copy of the transaction log that is available for restart of your system on a secondary machine in case the machine running the database server becomes unusable)

and also a replication server (from the Anywhere manual: Data replication is the sharing of data among physically distinct databases. When an application modifies shared data at any one database, the changes are propagated to the other databases in the replication setup. Changes can be propagated by various means and through a variety of channels, allowing flexible replication setups while preserving data integrity. Data replication is also referred to as data synchronization.)

Don’t forget this: a filesystem is a database for files!
Why does a filesystem exist? To store files…

So from an abstract point of view the preferred database to store files
is a filesystem.

This doesn’t mean you cannot successfully store files in another
database. But in my opinion, if the number (and possibly size) of files is
very large, why not use the native, best-performing database,
built to accomplish exactly this task?

I really feel more comfortable knowing that my images are managed by a good database engine and not just written on a shared file system exposed to stupid users / viruses / wrong commands / bad luck / renames / etc …

But every modern filesystem can manage permissions, users etc…
You could give the Servoy user exclusive access to the part of the filesystem
where you store the files…
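As a rough sketch of what "exclusive access" could look like on a POSIX system such as a Mac server: the directory is created and its mode set so that only the owning account (assumed here to be the account the Servoy server runs under) can read, write or list it. The helper name `make_private_dir` is made up for the example, and on Windows the same idea would be expressed with ACLs instead.

```python
import os
import stat
from pathlib import Path


def make_private_dir(path: Path) -> Path:
    """Create path if needed and restrict it to its owner (mode 0o700)."""
    path.mkdir(parents=True, exist_ok=True)
    # S_IRWXU = read, write and traverse for the owner; nothing for group or others.
    os.chmod(path, stat.S_IRWXU)
    return path
```

This has to be run by (or the directory then handed over to) the account that owns the image store; other local users and network shares then get no access to the files.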

Thanks for all the replies. I am leaning more and more toward using files in the filesystem. The database itself lives in the filesystem and is not going to survive a filesystem corruption / hack / virus / stupid user etc. either, so at least if my database without blobs goes wrong, I still have the pictures and can recover the database from a backup. IF it stores the pictures as blobs (which is just as susceptible to file system problems as separate files would be) and something does go wrong, I lose the lot or face an enormous rebuild.

Getting less and less convincing.

a.mariottini:
Don’t forget this: a filesystem is a database for files!
Why does a filesystem exist? To store files…

So from an abstract point of view the preferred database to store files
is a filesystem.

This doesn’t mean you cannot successfully store files in another
database. But in my opinion, if the number (and possibly size) of files is
very large, why not use the native, best-performing database,
built to accomplish exactly this task?

But your point can be easily reversed: the preferred filesystem could be a database.
I read somewhere that Longhorn will replace the current filesystem with MSServer and, if I remember correctly, BeOS uses a database as its filesystem.

So, the question is not so clear, I’m afraid… :)

Anyway… with Servoy you can do it as you prefer!!! and that is nice and ;) democratic ;).. I’ll stay on DB…

ciao

Both Apple and Microsoft are incorporating relational database management systems in the next versions of their operating systems (yes, Apple will probably release it a tad earlier than MS) to manage files. The good news is that Servoy will happily support your nineties vision of storing files in a filesystem.

But your point can be easily reversed: the preferred filesystem could be a database…

Yes, this is true (I don’t know about MSServer as a filesystem etc…).
My point of view is simply this: at the moment the best database for storing files is the filesystem. This doesn’t mean it will stay true in the future. And of course you could use an SQL database to store files, but SQL databases were not designed to do this and, as far as I know, they are not optimized for it. A filesystem is indeed optimized for file management.

I hope in the future this could change and we could have a single thing to store files and other structured data.

a.mariottini:
I hope in the future this could change and we could have a single thing to store files and other structured data.

That’s already possible. Oracle for example introduced this 6 years ago: http://www.orafaq.com/faqifs.htm

Interesting discussion…

Just a small point about storing media files as BLOBs…
We have been trialling with PDFs, and with these the problem is that the blob size in SERVOY is much bigger than the original PDF (2MB vs 600KB or so, depending on what’s in the PDF)… this might be a case for using the filesystem, particularly if a fast connection is not available to the user.

But as you just have JPGs, in your case maybe this is not an issue…I would then probably lean towards SERVOY rather than the filesystem.

What if you have a mixture of media types?

Regards,

Guy

guydoms:
the problem is that the blob size in SERVOY is much bigger than the original PDF (2MB vs 600KB or so, depending on what’s in the PDF)…

This reminded me of a solution which we built for a customer which already stores files as blobs. The solution is supposed to store Word files exclusively, but while building it I was sticking any old file in there as a test. I have just had a look at it and retrieved the 3 attached files that are currently in it. They are 2 JPEGs and 1 TIFF totalling 6MB.

The Sybase .db file for that solution is 45.9MB!!! As a reference point, I have 10 x .db files in the Servoy database folder and the next biggest one, the Servoy repository, is only 7.8MB. Of the other .db files that I have built, the biggest (a much more extensive solution than the 45MB solution) is 3.8MB, and from there they go down to about 1.5MB.

It seems that Servoy does swell far beyond the size of the files being put in (I could have done this more scientifically by exporting the solution, importing it as a fresh empty version and adding files from scratch). It does seem, however, that either something adds data over and above the blobs that should actually be in there, or the blobs are not being removed when records are deleted. As another test, I deleted the record that contained the biggest file (the 5.7MB .tiff).

This had 2 very surprising effects. Firstly, it took about 7 seconds to delete the record on my dual 2.0GHz G5, and secondly, and most worryingly, it increased the .db file to 89.3MB. I opened and closed Servoy a couple of times: same thing. I then shut down dbsrv9 as well and restarted Servoy, which brought the size of the .db back to 45.9MB (it did not shrink by the size of the file being removed either). Deleting one record containing one blob of 5.7MB doubled the size of the Sybase .db file until dbsrv9 was quit and restarted…

I remain entirely unconvinced. I think Andrea said it best, a filesystem is for files - especially when I don’t understand what Servoy (or in fact Sybase) is actually doing with these blobs…

Good point!

Maybe Jan A. knows a Sybase guru who can explain this behaviour of Sybase (or Servoy?)

I think the behavior of blobs in the DB is not something Servoy controls, but is DB specific.

As for a DB size not shrinking when removing a record: that is default DB behavior. Space, once occupied, remains occupied by the DB until you rebuild the table.

As for the size temporarily increasing: Probably has to do with tracking, rollback segments, archive options etc.

BTW: on a filesystem, the space also remains “occupied” after deleting a file. Only because the filename entry is removed from the FAT, you cannot see it anymore. To really remove a file, you need to clean up your hard drive.

5 more cents…

Paul

pbakker:
As for a DB size not shrinking when removing a record: that is default DB behavior. Space, once occupied, remains occupied by the DB until you rebuild the table.

Hmmmmmm… I want to avoid the database needing maintenance. Even more reason I think to keep the database lean and use the filesystem for file storage…

pbakker:
BTW: on a filesystem, the space also remains “occupied” after deleting a file. Only because the filename entry is removed from the FAT, you cannot see it anymore. To really remove a file, you need to clean up your hard drive.

Not on my hard drive it doesn’t… I remove a file and I get the space back: directly, immediately and observably. The fact that the actual data of the file is not actively erased (or written over) when I empty my trash is simply a matter of not wasting cycles unnecessarily (it is more work to actively write over the space a file occupies than not to, which is what makes erased files retrievable). When a file is deleted, the directory of my disk is told that the space previously occupied by the file is now available again. This has the effect of giving back bytes. At least on my Mac’s HFS+ disk it does. I am not sure about NTFS or FAT/FAT32, but surely it is also just a directory issue…

Back on topic, if I am right about the file system giving back space for deleted files, it seems that this is another reason not to use Sybase for file storage. I don’t foresee a tremendous number of files being deleted, but if, for example, someone accidentally puts a 50MB file in the database rather than the 5MB version they wanted, then deletes the 50MB version and adds the 5MB version, we end up with an awful lot of extra, useless data which won’t disappear until the database is rebuilt… Not looking good.

Back on topic, if I am right about the file system giving back space for deleted files, it seems that this is another reason not to use Sybase for file storage. I don’t foresee a tremendous number of files being deleted, but if, for example, someone accidentally puts a 50MB file in the database rather than the 5MB version they wanted, then deletes the 50MB version and adds the 5MB version, we end up with an awful lot of extra, useless data which won’t disappear until the database is rebuilt… Not looking good.

That is not how it works, and it is not what Paul said.

The 50MB are still in the database file, but that space is reused for
newer data, exactly the way a filesystem does it.
The database simply takes the space away from the filesystem, but
internally the database has 50MB of free space in which to store data.
From the filesystem’s point of view this is used space; from
the database’s point of view it is not.

But this is a subtlety…
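The "space stays in the file but gets reused" behaviour described above is easy to observe with a small experiment. The sketch below uses SQLite purely because it ships with Python; it is not Sybase, its storage engine is different, and it does not reproduce the temporary doubling Thunder saw. The function name `blob_space_demo` and the table `files` are made up for the example. It records the size of the database file after inserting a blob, after deleting it, after inserting another blob of the same size, and after an explicit rebuild (`VACUUM` in SQLite terms).

```python
import os
import sqlite3
import tempfile


def blob_space_demo(blob_size: int = 2_000_000) -> dict:
    """Return the database file size after insert, delete, re-insert and VACUUM."""
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None means autocommit: each statement is committed at once.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("PRAGMA auto_vacuum = NONE")  # do not shrink the file automatically
    conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, data BLOB)")
    sizes = {}

    conn.execute("INSERT INTO files (data) VALUES (?)", (b"x" * blob_size,))
    sizes["after_insert"] = os.path.getsize(db_path)

    # Deleting the row frees pages inside the file but does not return them to the OS.
    conn.execute("DELETE FROM files")
    sizes["after_delete"] = os.path.getsize(db_path)

    # A new blob of the same size can go into the pages freed by the delete.
    conn.execute("INSERT INTO files (data) VALUES (?)", (b"y" * blob_size,))
    sizes["after_reinsert"] = os.path.getsize(db_path)

    # Only an explicit rebuild hands the free space back to the filesystem.
    conn.execute("DELETE FROM files")
    conn.execute("VACUUM")
    sizes["after_vacuum"] = os.path.getsize(db_path)

    conn.close()
    return sizes


if __name__ == "__main__":
    for step, size in blob_space_demo().items():
        print(f"{step:>15}: {size:>10,} bytes")
```

In SQLite at least, the file does not shrink after the delete, the second insert does not need a second 2MB of disk, and only the rebuild step makes the file small again. Whether Sybase reuses freed blob space in the same way is something to confirm in its own documentation.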