Rob Van Dyck
2015-06-26 12:44:49 UTC
Hi,
I work for a small company using (the latest stable) H2 in our software.
Our client base is starting to grow (roughly 100 installations on client
computers, most with DBs multiple GBs in size) and we are starting to run
into more problems with broken (and sometimes worse: unrepairable) H2 DBs.
Our clients use lots of different OSes (all Windows/Mac OS X) on normal
commodity hardware. To give you an estimate about the failure rate: we have
had about 10 broken DBs in the last 6 months.
We currently use an embedded persistent database with default connection
properties: "jdbc:h2:file:" + h2Path + ";IFEXISTS=TRUE", after which we set
autocommit to false. There is only one thread connected to the DB, and the
database was created using the latest stable H2 version.
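For reference, the connection setup described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the class name `EmbeddedH2Sketch` and its methods are hypothetical, credentials are omitted, and the H2 jar is assumed to be on the classpath at run time.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class; names are illustrative only.
final class EmbeddedH2Sketch {

    // Builds the embedded file URL as described in the text:
    // "jdbc:h2:file:" + h2Path + ";IFEXISTS=TRUE"
    static String embeddedUrl(String h2Path) {
        return "jdbc:h2:file:" + h2Path + ";IFEXISTS=TRUE";
    }

    // Opens the connection and switches autocommit off.
    // Assumption: no user name or password is needed; the H2 driver
    // must be on the classpath for DriverManager to find it.
    static Connection open(String h2Path) throws SQLException {
        Connection conn = DriverManager.getConnection(embeddedUrl(h2Path));
        try {
            conn.setAutoCommit(false);
            return conn;
        } catch (SQLException e) {
            // Do not leak the connection if autocommit cannot be changed.
            conn.close();
            throw e;
        }
    }

    public static void main(String[] args) {
        // Only prints the URL; "/path/to/db" is a placeholder path.
        System.out.println(embeddedUrl("/path/to/db"));
    }
}
```

With IFEXISTS=TRUE the connection attempt fails instead of silently creating a new, empty database when the file is missing.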
We know for sure that a few instances happened shortly after our software
ran into an out-of-memory situation. We also suspect some happened after an
OS-level crash which caused the computer to reboot without having a chance
to shut down properly (e.g., power failure or the user pressing the reset
button).
The data is privacy sensitive, so we are reluctant to provide it to you
unless that is the only option.
We were hoping you might be able to give us some hints on what we might
do to avoid these issues:
1. We are converting our embedded persistent H2 DB to a (TCP) server started
by a different process. We hope that OOMs in our software will then no
longer corrupt the DB, since the H2 process can shut down cleanly. Do you
think this might help for OOMs?
2. We are wondering whether we are missing certain properties to set on the
connection? We looked at UNDO_LOG and LOG, but the default settings are
already the 'safest'.
3. We are using the latest stable version, 1.3.176, with its default
storage engine (called B-tree?), i.e., we don't use MVStore.
Should we consider moving to the beta version? Could that possibly have
more protection against these types of failure?
4. We know some instances of corruption happened in a virtualized
environment (where the guest OS 'crashed'). We tried to reproduce this by
running a Windows 8 guest on a Linux host, where we reset and shut down
our application multiple times (10) while it was performing heavy
database updates. We could not reproduce the issue.
5. One of the problems is that we cannot reliably detect corruption. At one
time we ran the H2 recovery tool, which gave us no errors, so we continued
using the existing DB, but immediately afterwards H2 complained about
corruption. Is this possible (does the recovery tool check for all kinds
of errors, or does it skip, e.g., index pages)? Is there a way to know for
sure that there is no corruption?
6. We have tried on some occasions to run the recovery tool and re-import
the corrupted database, but at least on one occasion this gave us errors so
we were unable to restore the data. Unfortunately we do not have the error
output anymore.
7. The next time this happens, is there anything that we should check
(e.g., the trace file)?
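To make item 1 concrete, here is a rough sketch of the setup we have in mind: the H2 TCP server runs in its own JVM and our application connects to it over TCP. The class name `H2ServerProcessSketch`, the jar path, the base directory, and the database name are placeholders chosen for illustration. The sketch assumes `java` is on the PATH and uses the `-tcp`, `-tcpPort` and `-baseDir` options of `org.h2.tools.Server`. Whether this actually prevents corruption is exactly what we are asking; the sketch only shows the process separation.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names and paths are illustrative only.
final class H2ServerProcessSketch {

    // Command line that starts the H2 TCP server in a separate JVM.
    static List<String> serverCommand(String h2JarPath, String baseDir, int tcpPort) {
        return Arrays.asList(
            "java", "-cp", h2JarPath, "org.h2.tools.Server",
            "-tcp", "-tcpPort", String.valueOf(tcpPort),
            "-baseDir", baseDir);
    }

    // Client URL for a database located under the server's base directory.
    static String tcpUrl(int tcpPort, String dbName) {
        return "jdbc:h2:tcp://localhost:" + tcpPort + "/" + dbName + ";IFEXISTS=TRUE";
    }

    // Launches the server process; the caller is responsible for stopping it.
    static Process start(String h2JarPath, String baseDir, int tcpPort) throws IOException {
        return new ProcessBuilder(serverCommand(h2JarPath, baseDir, tcpPort))
            .inheritIO() // forward the server's console output to ours
            .start();
    }

    public static void main(String[] args) {
        // Only prints what would be run; all values are placeholders.
        System.out.println(serverCommand("h2.jar", "/path/to/dbdir", 9092));
        System.out.println(tcpUrl(9092, "mydb"));
    }
}
```

The idea is that an OutOfMemoryError in the application JVM would then not occur in the JVM that writes the database file.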
I'll include some of the stack traces; maybe they can give you an indication
of what might have gone wrong.
Thanks for your answers and/or tips.
Kind regards,
Rob.
--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database+***@googlegroups.com.
To post to this group, send email to h2-***@googlegroups.com.
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.