How to read a TXT file bigger than 8 MB

I need to read text files that are 30-100 MB in size. When I try to open such a file, I see the error message “java.lang.OutOfMemoryError”. Files smaller than 8 MB are read without problems.

textData = plugins.file.readTXTFile( contents[p] );

Mac OS 10.3.9, 2 GB RAM, Java version 1.4.2_56
Servoy 2.2.4-build 336

Best Regards
AGIS

Are you actually surprised by this?

I think the point about this topic is that there is no way in Servoy (as far as I know) to read a file in chunks… to make very large files manageable.

In my case, I transfer data from FileMaker as merge files, read them in Servoy using the file plugin, process the data, and insert the results into PostgreSQL using the RawSQL plugin. I get very good performance, but what will happen if I have to import a 50 MB file?

You could also just give Servoy more memory. Just add this line to your servoy_developer.bat:

java -Xmx512m -classpath servoy_developer.jar Servoy

instead of

java -classpath servoy_developer.jar Servoy

This will allow Servoy Developer to use up to 512 MB of RAM.

patrick, that’s indeed one of the solutions, but it would be nice if we could read or import text files in chunks (or line by line).

Servoy, is that hard to implement?

I have made a feature request for this: http://forum.servoy.com/viewtopic.php?p=29625#29625

Christian said:

I think the point about this topic is that there is no way in Servoy (as far as I know) to read a file in chunks… to make very large files manageable.

The issue is that holding onto a file and reading from it ‘in chunks’ requires a dedicated reader, i.e. a reader that knows how to read that file safely.
That means somebody needs to write a stream reader for the content (if you don’t do it as a stream, you’ll always run out of memory at some point),
and somebody also needs to write a UI controller that lets the solution developer or the solution user control that read process (and probably only in a forwards direction).
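Just to illustrate the core of it, a minimal line-by-line stream reader in plain Java looks something like the sketch below (the file name is a placeholder, and this is only the reading loop, not the plugin or UI work around it):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class LineByLineReader {
    public static void main(String[] args) throws IOException {
        // "export.txt" is a placeholder path; only the current line is ever held
        // in memory, so the size of the file no longer matters.
        BufferedReader reader = new BufferedReader(new FileReader("export.txt"));
        try {
            String line;
            int count = 0;
            while ((line = reader.readLine()) != null) {
                count++;
                // process one line here: parse fields, build an INSERT, etc.
            }
            System.out.println("Read " + count + " lines");
        } finally {
            reader.close();
        }
    }
}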

What I am saying is that this is a lot of work for, in general, minimal return. You won’t see many such implementations in the general software community. For example, neither Microsoft nor Sun offers such a generic reader utility (at least not one that is publicly available). The reason is simple: you need a defined structure in your data, say something that conforms to a schema. Basically, you can’t reliably read a stream without knowing what can come down the wire.

So, there is an option if you have a file whose input stream structure can be anticipated. That’s where SAX and schemas come into the equation. If this TXT file is really useful and provides structured data, it could be read by any SAX parser, and a generic SAX handler implementation could be put in place with a generic ‘walker’. Such implementations do exist and can also be readily written.
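To make that concrete, a generic SAX ‘walker’ in plain Java is only a few lines; the sketch below assumes a made-up file name and a made-up “record” element, and in Servoy it would still have to live in a plugin:

import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingXmlWalker {
    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        // SAX pushes events as it reads, so the whole document is never in memory.
        parser.parse(new File("export.xml"), new DefaultHandler() {
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("record".equals(qName)) {   // "record" is an assumed element name
                    // collect the record's fields here and hand one record to the import step
                }
            }
        });
    }
}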

So, my parting question is “who will use it, and how (and why)?”

What is in this large text file that you need to be able to access its content in its entirety at any moment in time?

That should be easily doable in a plugin, I think… So yes, the file plugin could do that.

SalernoJ:
So, my parting question is “who will use it, and how (and why)?”

I have to keep Servoy systems reasonably synched with FileMaker databases (until we can get rid of FileMaker completely).
I use Servoy to export merge files from FileMaker, then I read the file in Servoy, filter the data, feed it to the raw-sql plugin to import into postgres.
I can import up to 50,000 records a minute :) The process is almost automatic and works fine because none of the tables has more than 100,000 records.

I do have another FM system with 3,000,000 records in it. While I don’t have a requirement to process that database, I know my process would fail on it.

Reading text files and importing record data…different spaces I would have thought.

I wouldn’t employ “plugins.file.readTXTFile” as part of a record import process.
Record import is a critical step. So, knowing the constraints of any in-memory file cache, it’s too risky.

You would be better off building a merge file with an XML output format (I think FMP can do that for you easily) and avoiding the text reader altogether. That would be much more efficient.

Slightly off topic: when do you warehouse your data? 3 million records is a lot of live records, even for a small bank! I would consider an archival mechanism based on business date and state requirements.

I have to make one point clear: I’m not only importing data. I’m also doing the following (see the sketch after this list):

  • reformatting dates
  • reformatting numbers
  • checking lengths of strings
  • changing pks from alphanumeric strings into integers
  • changing all fks based on the changes to the pks and verifying them, since the new database has many constraints to ensure data integrity.
  • splitting FM tables into multiple SQL tables where it makes sense.
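To give a rough idea of the kind of per-record work I mean, a couple of those steps in plain Java might look like the fragment below (the date pattern and the pk mapping are made-up examples for illustration, not my actual code):

import java.text.SimpleDateFormat;
import java.util.Date;

public class RecordCleanup {
    // Reformat a FileMaker-style date (e.g. "31.12.2005") to ISO format for PostgreSQL.
    // The input pattern is an assumption; adjust it to whatever the merge file contains.
    static String reformatDate(String fmDate) throws java.text.ParseException {
        Date d = new SimpleDateFormat("dd.MM.yyyy").parse(fmDate);
        return new SimpleDateFormat("yyyy-MM-dd").format(d);
    }

    // Map an alphanumeric FileMaker pk to a new integer pk, keeping the mapping
    // so the fks can be rewritten afterwards.
    static java.util.Map pkMap = new java.util.HashMap();
    static int nextPk = 1;

    static int mapPk(String oldPk) {
        Integer newPk = (Integer) pkMap.get(oldPk);
        if (newPk == null) {
            newPk = new Integer(nextPk++);
            pkMap.put(oldPk, newPk);
        }
        return newPk.intValue();
    }
}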

Servoy does all this work for me as long as I don’t have to process a file > 8MB. That’s my point.

I wrote something in RealBasic when I thought I had to process those 55 tables containing a total of 3,000,000 records, but my Servoy solution is better and faster. It is just an extra module which I can remove from my projects when it is no longer needed.

What we do in a case like that is to import the text file into a table and use Servoy to process the data and move it to the actual target table. Maybe that is worth a thought…

That does not change anything.
Either way, you have to load the text file into memory first!

patrick:
What we do in a case like that is to import the text file into a table and use Servoy to process the data and move it to the actual target table. Maybe that is worth a thought…

The difference is that I use Servoy to create the temporary table on the fly based on the headers I find in the merge file :)
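Roughly, that on-the-fly table creation amounts to something like the Java sketch below (a simplification: real merge files are quoted CSV, the naive comma split and the all-text column types are just for illustration):

public class TempTableBuilder {
    // Build a throwaway CREATE TABLE statement from the header line of a merge file.
    // Splitting on commas and stripping quotes is a shortcut; a real merge file may
    // need a proper CSV parser, and every column is simply typed as text here.
    static String createTableSql(String tableName, String headerLine) {
        String[] columns = headerLine.split(",");
        StringBuffer sql = new StringBuffer("CREATE TABLE " + tableName + " (");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) sql.append(", ");
            String name = columns[i].trim().replaceAll("\"", "").toLowerCase();
            sql.append(name).append(" text");
        }
        sql.append(")");
        return sql.toString();
    }
}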

HJK:
That does not change anything.
Either way, you have to load the text file into memory first!

Most databases have a tool that will import any text file in seconds. I can easily import a 250 MB text file into SQL Server, for example, without any problems. You don’t have to use Servoy for everything :)

patrick:

HJK:
That does not change anything.
Either way, you have to load the text file into memory first!

Most databases have a tool that will import any text file in seconds. I can easily import a 250 MB text file into SQL Server, for example, without any problems. You don’t have to use Servoy for everything :)

Sure. You may have better import tools than I have. Aqua Data Studio/PG Admin III/DBViz… none of them remembers to align source and destination columns properly. That is why I prefer to push one button in Servoy and have it all done for me.

I read files line-by-line into HyperCard back in 1991…