Blob.js
An HTML5 Blob implementation
I'm struggling to find a way to insert LARGE images (>100MB, mostly TIFF format) into my Oracle database, using BLOB columns.
I've searched thoroughly across the web and even on StackOverflow, without being able to find an answer to this problem.
First the problem, then a short section on the relevant code (Java classes/configuration), and finally a third section showing the JUnit test I've written to test image persistence (the error occurs during its execution).
Edit: I've added a section at the end of the question describing some tests and analysis done with JConsole.
The problem
I receive a java.lang.OutOfMemoryError: Java heap space
error when using Hibernate to persist very large images/documents:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
at org.hibernate.type.descriptor.java.DataHelper.extractBytes(DataHelper.java:190)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:123)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:47)
at org.hibernate.type.descriptor.sql.BlobTypeDescriptor$4$1.doBind(BlobTypeDescriptor.java:101)
at org.hibernate.type.descriptor.sql.BasicBinder.bind(BasicBinder.java:91)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:283)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:278)
at org.hibernate.type.AbstractSingleColumnStandardBasicType.nullSafeSet(AbstractSingleColumnStandardBasicType.java:89)
at org.hibernate.persister.entity.AbstractEntityPersister.dehydrate(AbstractEntityPersister.java:2184)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2430)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2874)
at org.hibernate.action.EntityInsertAction.execute(EntityInsertAction.java:79)
at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:273)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:265)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:184)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:51)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1216)
at it.paoloyx.blobcrud.manager.DocumentManagerTest.testInsertDocumentVersion(DocumentManagerTest.java:929)
The code (domain objects, repository classes, configuration)
Here is the stack of technologies I'm using (from the DB up to the business-logic tier). I use JDK 6.
- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
- ojdbc6.jar (for 11.2.0.3 release)
- Hibernate 4.0.1 Final
- Spring 3.1.GA RELEASE
I have two domain classes, mapped in a one-to-many fashion. A DocumentVersion has many DocumentData instances, each of which can represent different binary content for the same DocumentVersion.
Relevant extract from the DocumentVersion class:
@Entity
@Table(name = "DOCUMENT_VERSION")
public class DocumentVersion implements Serializable {
private static final long serialVersionUID = 1L;
private Long id;
private Set<DocumentData> otherDocumentContents = new HashSet<DocumentData>(0);
@Id
@GeneratedValue(strategy = GenerationType.TABLE)
@Column(name = "DOV_ID", nullable = false)
public Long getId() {
return id;
}
@OneToMany
@Cascade({ CascadeType.SAVE_UPDATE })
@JoinColumn(name = "DOD_DOCUMENT_VERSION")
public Set<DocumentData> getOtherDocumentContents() {
return otherDocumentContents;
}
Relevant extract from the DocumentData class:
@Entity
@Table(name = "DOCUMENT_DATA")
public class DocumentData {
private Long id;
/**
* The binary content (java.sql.Blob)
*/
private Blob binaryContent;
@Id
@GeneratedValue(strategy = GenerationType.TABLE)
@Column(name = "DOD_ID", nullable = false)
public Long getId() {
return id;
}
@Lob
@Column(name = "DOD_CONTENT")
public Blob getBinaryContent() {
return binaryContent;
}
Here are my Spring and Hibernate configuration main parameters:
<bean id="sessionFactory"
class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="packagesToScan" value="it.paoloyx.blobcrud.model" />
<property name="hibernateProperties">
<props>
<prop key="hibernate.dialect">org.hibernate.dialect.Oracle10gDialect</prop>
<prop key="hibernate.hbm2ddl.auto">create</prop>
<prop key="hibernate.jdbc.batch_size">0</prop>
<prop key="hibernate.jdbc.use_streams_for_binary">true</prop>
</props>
</property>
</bean>
<bean class="org.springframework.orm.hibernate4.HibernateTransactionManager"
id="transactionManager">
<property name="sessionFactory" ref="sessionFactory" />
</bean>
<tx:annotation-driven transaction-manager="transactionManager" />
My datasource definition:
<bean class="org.apache.commons.dbcp.BasicDataSource"
destroy-method="close" id="dataSource">
<property name="driverClassName" value="${database.driverClassName}" />
<property name="url" value="${database.url}" />
<property name="username" value="${database.username}" />
<property name="password" value="${database.password}" />
<property name="testOnBorrow" value="true" />
<property name="testOnReturn" value="true" />
<property name="testWhileIdle" value="true" />
<property name="timeBetweenEvictionRunsMillis" value="1800000" />
<property name="numTestsPerEvictionRun" value="3" />
<property name="minEvictableIdleTimeMillis" value="1800000" />
<property name="validationQuery" value="${database.validationQuery}" />
</bean>
where properties are taken from here:
database.driverClassName=oracle.jdbc.OracleDriver
database.url=jdbc:oracle:thin:@localhost:1521:devdb
database.username=blobcrud
database.password=blobcrud
database.validationQuery=SELECT 1 from dual
I've got a service class, that delegates to a repository class:
@Transactional
public class DocumentManagerImpl implements DocumentManager {
DocumentVersionDao documentVersionDao;
public void setDocumentVersionDao(DocumentVersionDao documentVersionDao) {
this.documentVersionDao = documentVersionDao;
}
and now the relevant extracts from repository classes:
public class DocumentVersionDaoHibernate implements DocumentVersionDao {
@Autowired
private SessionFactory sessionFactory;
@Override
public DocumentVersion saveOrUpdate(DocumentVersion record) {
this.sessionFactory.getCurrentSession().saveOrUpdate(record);
return record;
}
The JUnit test that causes the error
If I run the following unit test, I get the aforementioned error (java.lang.OutOfMemoryError: Java heap space):
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath*:META-INF/spring/applicationContext*.xml" })
@Transactional
public class DocumentManagerTest {
@Autowired
protected DocumentVersionDao documentVersionDao;
@Autowired
protected SessionFactory sessionFactory;
@Test
public void testInsertDocumentVersion() throws SQLException {
// Original mock document content
DocumentData dod = new DocumentData();
// image.tiff is approx. 120MB
File veryBigFile = new File("/Users/paoloyx/Desktop/image.tiff");
try {
Session session = this.sessionFactory.getCurrentSession();
InputStream inStream = FileUtils.openInputStream(veryBigFile);
Blob blob = Hibernate.getLobCreator(session).createBlob(inStream, veryBigFile.length());
dod.setBinaryContent(blob);
} catch (IOException e) {
e.printStackTrace();
dod.setBinaryContent(null);
}
// Save a document version linked to previous document contents
DocumentVersion dov = new DocumentVersion();
dov.getOtherDocumentContents().add(dod);
documentVersionDao.saveOrUpdate(dov);
this.sessionFactory.getCurrentSession().flush();
Long insertedId = dov.getId();
// Clear session, then try retrieval
this.sessionFactory.getCurrentSession().clear();
DocumentVersion dbDov = documentVersionDao.findByPK(insertedId);
Assert.assertNotNull("The document version returned for id " + insertedId + " is null", dbDov);
Assert.assertNotNull("The retrieved document version has no additional contents associated with it", dbDov.getOtherDocumentContents());
Assert.assertEquals("The number of secondary contents does not match the number saved", 1, dbDov.getOtherDocumentContents().size());
}
The same code works against a PostgreSQL 9 installation: the image is actually written to the database.
Debugging my code, I found that the PostgreSQL JDBC driver writes to the database using a buffered output stream... while the Oracle OJDBC driver tries to allocate, all at once, the whole byte[] representing the image.
From the error stack:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
Is the error due to this behavior?
Can anyone give me some insights on this problem?
Thanks everyone.
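For reference, the same insert can be done with plain JDBC streaming, so the driver can read the file in chunks instead of materializing one huge byte[] (a minimal sketch against the DOCUMENT_DATA table from above, using a direct connection; try-with-resources is Java 7 syntax, so on JDK 6 the resources would be closed in a finally block):
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class StreamingBlobInsert {
    public static void main(String[] args) throws Exception {
        File veryBigFile = new File("/Users/paoloyx/Desktop/image.tiff");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:devdb", "blobcrud", "blobcrud");
             InputStream in = new FileInputStream(veryBigFile);
             PreparedStatement ps = conn.prepareStatement(
                "insert into DOCUMENT_DATA (DOD_ID, DOD_CONTENT) values (?, ?)")) {
            ps.setLong(1, 1L);
            // The driver pulls bytes from the stream; the file never sits fully in the heap
            ps.setBinaryStream(2, in, veryBigFile.length());
            ps.executeUpdate();
        }
    }
}
Whether Hibernate takes an equivalent streaming path depends on the dialect and driver, which is what the tests below try to measure.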
Memory Tests with JConsole
Thanks to the suggestions received on my question, I've run some simple tests to show the memory usage of my code with two different JDBC drivers, one for PostgreSQL and one for Oracle.
Test setup:
- The test has been conducted using the JUnit test described in the previous section.
- JVM heap size has been set to 512MB, using the parameter -Xmx512m
- For Oracle database, I've used ojdbc6.jar driver
- For Postgres database, I've used 9.0-801.jdbc3 driver (via Maven)
First test, with a file of approx 150MB
In this first test, both Oracle and PostgreSQL passed (this is BIG news).
The file size is about 1/3 of the available JVM heap.
Here are the pictures of JVM memory consumption:
[JConsole screenshot: Oracle, 512MB heap size, 150MB file]
[JConsole screenshot: PostgreSQL, 512MB heap size, 150MB file]
Second test, with a file of approx 485MB
In this second test, only PostgreSQL passed the test; Oracle failed it.
The file size is very near the available JVM heap size.
Here are the pictures of JVM memory consumption:
[JConsole screenshot: Oracle, 512MB heap size, 485MB file]
[JConsole screenshot: PostgreSQL, 512MB heap size, 485MB file]
Analysis of the tests:
It seems that the PostgreSQL driver handles memory without surpassing a certain threshold, while the Oracle driver behaves very differently.
I honestly can't explain why the Oracle JDBC driver leads to the error (the same java.lang.OutOfMemoryError: Java heap space) when the file is sized near the available heap space.
Can anyone give me more insights?
Thanks a lot for your help :)
Source: (StackOverflow)
While I was trying to create a workaround for Chrome not supporting Blobs in IndexedDB, I discovered that I could read an image through AJAX as an ArrayBuffer, store it in IndexedDB, extract it, convert it to a Blob and then show it in an element, using the following code:
var xhr = new XMLHttpRequest(), newphoto;
xhr.open("GET", "photo1.jpg", true);
xhr.responseType = "arraybuffer";
xhr.addEventListener("load", function () {
  if (xhr.status === 200) {
    newphoto = xhr.response;
    /* store "newphoto" in IndexedDB */
    ...
  }
});
xhr.send();
document.getElementById("show_image").onclick=function() {
var store = db.transaction("files", "readonly").objectStore("files").get("image1");
store.onsuccess = function() {
var URL = window.URL || window.webkitURL;
var oMyBlob = new Blob([store.result.image], { "type" : "image\/jpg" });
var docURL = URL.createObjectURL(oMyBlob);
var elImage = document.getElementById("photo");
elImage.setAttribute("src", docURL);
URL.revokeObjectURL(docURL);
}
}
This code works fine. But if I try the same process, this time loading a video (.mp4), I can't show it:
...
var oMyBlob = new Blob([store.result.image], { "type" : "video\/mp4" });
var docURL = URL.createObjectURL(oMyBlob);
var elVideo = document.getElementById("showvideo");
elVideo.setAttribute("src", docURL);
...
<video id="showvideo" controls ></video>
...
Even if I use xhr.responseType = "blob", and instead of storing the blob in IndexedDB I try to show it immediately after loading it, it still does not work!
xhr.responseType = "blob";
xhr.addEventListener("load", function () {
  if (xhr.status === 200) {
    newvideo = xhr.response;
    var docURL = URL.createObjectURL(newvideo);
    var elVideo = document.getElementById("showvideo");
    elVideo.setAttribute("src", docURL);
    URL.revokeObjectURL(docURL);
  }
});
The next step was trying to do the same thing for PDF files, but I'm stuck with video files!
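One detail worth isolating (an observation, not a confirmed fix): in these snippets URL.revokeObjectURL(docURL) runs immediately after the src is set, so the URL may already be dead by the time the <video> element starts fetching its data; an image can often get away with this because it is fetched in one shot, while video is fetched progressively. A minimal sketch that defers the revocation:
xhr.responseType = "blob";
xhr.addEventListener("load", function () {
  if (xhr.status === 200) {
    var docURL = URL.createObjectURL(xhr.response);
    var elVideo = document.getElementById("showvideo");
    // Revoke only once the element has buffered enough to play through
    elVideo.addEventListener("canplaythrough", function onReady() {
      elVideo.removeEventListener("canplaythrough", onReady);
      URL.revokeObjectURL(docURL);
    });
    elVideo.setAttribute("src", docURL);
  }
});
xhr.send();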
Source: (StackOverflow)
How can I insert an image in MySQL and then retrieve it using PHP?
I have limited experience in either area, and I could use a little code to get me started in figuring this out.
Source: (StackOverflow)
I have been pretty much fascinated by these two data types. According to the Oracle docs, they are presented as follows:
BLOB: Variable-length binary large object string that can be up to 2GB (2,147,483,647) long. Primarily intended to hold non-traditional data, such as voice or mixed media. BLOB strings are not associated with a character set, as with FOR BIT DATA strings.
CLOB: Variable-length character large object string that can be up to 2GB (2,147,483,647) long. A CLOB can store single-byte character strings or multibyte, character-based data. A CLOB is considered a character string.
But what I don't know is whether there is any difference between the two from a DB2 versus Oracle perspective. What are the differences between DB2 CLOB and Oracle CLOB, and between DB2 BLOB and Oracle BLOB? What is the maximum size of each in DB2 and Oracle? Is it just 2GB?
Any help is greatly appreciated!
Source: (StackOverflow)
How do you set the name of a blob file in JavaScript when force downloading it through window.location?
function newFile(data) {
var json = JSON.stringify(data);
var blob = new Blob([json], {type: "octet/stream"});
var url = window.URL.createObjectURL(blob);
window.location.assign(url);
}
Running the above code downloads a file instantly, without a page refresh, with a name that looks like bfefe410-8d9c-4883-86c5-d76c50a24a1d. I want to set the filename to my-download.json instead.
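A common workaround is to route the download through a temporary <a> element, whose HTML5 download attribute carries the filename - something window.location.assign cannot express (a minimal sketch):
function newFile(data, filename) {
  var json = JSON.stringify(data);
  var blob = new Blob([json], {type: "octet/stream"});
  var url = window.URL.createObjectURL(blob);
  var a = document.createElement("a");
  a.href = url;
  a.download = filename; // e.g. "my-download.json"
  document.body.appendChild(a);
  a.click();
  document.body.removeChild(a);
  window.URL.revokeObjectURL(url);
}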
Source: (StackOverflow)
I am developing an application where I receive image data stored in a Uint8Array.
I then transform this data into a Blob and build the image URL.
Simplified code to get data from server:
var array;
var req = new XMLHttpRequest();
var url = "img/" + uuid + "_" +segmentNumber+".jpg";
req.open("GET", url, true);
req.responseType = "arraybuffer";
req.onload = function (oEvent) {
  var data = req.response;
  array = new Uint8Array(data); // unsigned, matching the Uint8Array described above
};
req.send(null);
Constructor:
out = new Blob([data], {type : datatype} );
The Blob constructor is causing the problem. It works fine in all browsers except Chrome, on both mobile and desktop devices.
Use of Blob:
// Receive Uint8Array using AJAX here
// array = ...
// Create BLOB
var jpeg = new Blob( [array.buffer], {type : "image/jpeg"});
var url = DOMURL.createObjectURL(jpeg);
img.src = url;
Desktop Chrome gives me a warning: ArrayBuffer values are deprecated in Blob Constructor. Use ArrayBufferView instead.
Mobile Chrome gives me an error: illegal constructor.
If I change the constructor to work in Chrome, it fails in the other browsers.
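A sketch of one way to satisfy both sides (assuming the warning and the error share the same root cause): pass the typed array itself, which is an ArrayBufferView, and fall back to the prefixed BlobBuilder API on the older Chrome builds where new Blob(...) itself throws:
function makeBlob(view, type) {
  try {
    // Passing the view (not view.buffer) is what the deprecation warning asks for
    return new Blob([view], {type: type});
  } catch (e) {
    // Older Chrome/Android builds only expose the prefixed BlobBuilder
    var Builder = window.BlobBuilder || window.WebKitBlobBuilder || window.MozBlobBuilder;
    var bb = new Builder();
    bb.append(view.buffer);
    return bb.getBlob(type);
  }
}

var jpeg = makeBlob(array, "image/jpeg"); // array is the Uint8Array from above
img.src = DOMURL.createObjectURL(jpeg);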
Source: (StackOverflow)
I have an SQLite table that contains a BLOB column, but I need to do a size/length check on the blob. How do I do that?
According to some documentation I did find, using length(blob) won't work, because length() only works on texts and will stop counting after the first NULL. My empirical tests have shown this to be true.
I'm using SQLite 3.4.2
Updates:
So as of SQLite 3.7.6 it appears that the length() function returns the correct value for blobs - I checked various SQLite change logs, but did not see in which version this was corrected.
From SQLite 3.7.6:
payload_id|length(payload)|length(hex(payload))/2
1807913|194|194
1807914|171|171
The documentation was changed to reflect this.
length(X) The length(X) function returns the length of X in characters if X is
a string, or in bytes if X is a blob. If X is NULL then length(X) is
NULL. If X is numeric then length(X) returns the length of a string
representation of X.
Source: (StackOverflow)
Currently, I've got images (max. 6MB) stored as BLOBs in an InnoDB table.
As the size of the data grows, the nightly backup is getting slower and slower, hindering normal performance.
So the binary data needs to go to the file system (pointers to the files will be kept in the DB).
The data has a tree-like relation:
- main site
  - user_0
    - album_0
    - album_1
    - album_n
  - user_1
  - user_n
etc...
Now I want the data to be distributed evenly through the directory structure. How should I accomplish this?
I guess I could try MD5('userId, albumId, imageId');
and slice up the resulting string to get my directory path:
/var/imageStorage/f/347e/013b/c042/51cf/985f7ad0daa987d.jpeg
This would allow me to map the first character to a server and evenly distribute the directory structure over multiple servers.
This would however not keep images organised per user, likely spreading the images for 1 album over multiple servers.
My question is:
What is the best way to store the image data in the file system in a balanced way, while keeping user/album data together?
Am I thinking in the right direction? Or is this the wrong way of doing things altogether?
Update:
I will go for the md5(user_id) string slicing for the split at the highest level, and then put all user data in that same bucket. This will ensure an even distribution of data while keeping user data stored close together (see the sketch after the layout below).
/var
- imageStorage
- f/347e/013b
- f347e013bc04251cf985f7ad0daa987d
- 0
- album1_10
- picture_1.jpeg
- 1
- album1_1
- picture_2.jpeg
- picture_3.jpeg
- album1_11
- picture_n.jpeg
- n
- album1_n
I think I will use the albumId split up from behind (I like that idea!) so as to keep the number of albums per directory smaller (although it won't be necessary for most users).
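For illustration, a minimal sketch of that path derivation (the fan-out depth and directory names are arbitrary assumptions, not a recommendation):
import java.security.MessageDigest;

public class ImagePath {
    static String pathFor(long userId, long albumId, String fileName) throws Exception {
        // md5(user_id) gives an evenly distributed hex string to slice into directories
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(Long.toString(userId).getBytes("UTF-8"));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        // The first char can pick the server; the next slices build the fan-out.
        // All of a user's albums stay under the same bucket.
        return "/var/imageStorage/" + hex.charAt(0) + "/" + hex.substring(1, 5)
                + "/" + hex.substring(5, 9) + "/" + userId + "/" + albumId + "/" + fileName;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pathFor(42L, 7L, "picture_1.jpeg"));
    }
}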
Thanks!
Source: (StackOverflow)
Sorry for the noob question, but what does the circled button mean, and how can I view the content of a blob?
Source: (StackOverflow)
I was studying Blobs, and I noticed that when you have an ArrayBuffer, you can easily convert this to a Blob as follows:
var dataView = new DataView(arrayBuffer);
var blob = new Blob([dataView], { type: mimeString });
The question I have now is: is it possible to go from a Blob to an ArrayBuffer?
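For the reverse direction, a minimal sketch using FileReader (the read is asynchronous, so the bytes arrive in a callback):
var reader = new FileReader();
reader.onload = function () {
  var arrayBuffer = reader.result; // the Blob's bytes, back as an ArrayBuffer
  // ... use arrayBuffer here ...
};
reader.readAsArrayBuffer(blob);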
Source: (StackOverflow)
When I search the web for inserting BLOBs into an Oracle database with the JDBC thin driver, most of the web pages suggest a 3-step approach:
- insert an empty_blob() value
- select the row with for update
- insert the real value
This works fine for me, here is an example:
Connection oracleConnection = ...
byte[] testArray = ...
PreparedStatement ps = oracleConnection.prepareStatement(
"insert into test (id, blobfield) values(?, empty_blob())");
ps.setInt(1, 100);
ps.executeUpdate();
ps.close();
ps = oracleConnection.prepareStatement(
"select blobfield from test where id = ? for update");
ps.setInt(1, 100);
OracleResultSet rs = (OracleResultSet) ps.executeQuery();
if (rs.next()) {
BLOB blob = (BLOB) rs.getBLOB(1);
OutputStream outputStream = blob.setBinaryStream(0L);
InputStream inputStream = new ByteArrayInputStream(testArray);
byte[] buffer = new byte[blob.getBufferSize()];
int byteread = 0;
while ((byteread = inputStream.read(buffer)) != -1) {
outputStream.write(buffer, 0, byteread);
}
outputStream.close();
inputStream.close();
}
There are some web pages where the authors suggest a simpler 1-step solution. The previous example with this solution:
Connection oracleConnection = ...
byte[] testArray = ...
PreparedStatement ps = oracleConnection.prepareStatement(
"insert into test(id, blobfield) values(?, ?)");
BLOB blob = BLOB.createTemporary(oracleConnection, false, BLOB.DURATION_SESSION);
OutputStream outputStream = blob.setBinaryStream(0L);
InputStream inputStream = new ByteArrayInputStream(testArray);
byte[] buffer = new byte[blob.getBufferSize()];
int byteread = 0;
while ((byteread = inputStream.read(buffer)) != -1) {
outputStream.write(buffer, 0, byteread);
}
outputStream.close();
inputStream.close();
ps.setInt(1, 100);
ps.setBlob(2, blob);
ps.executeUpdate();
ps.close();
The second version is much easier, so my question is: what is the point of the first (popular) solution? Is there (or was there) some kind of constraint on the second solution (Oracle server version, JDBC driver version, size of the blob, ...)? Is the first solution better (speed, memory consumption, ...)? Any reasons for not using the simpler second approach?
The exact same question applies to CLOB fields.
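For completeness, a third variant that avoids Oracle-specific classes altogether (a minimal sketch using standard JDBC streaming; whether it stays memory-friendly for very large values depends on the driver version):
PreparedStatement ps = oracleConnection.prepareStatement(
    "insert into test (id, blobfield) values (?, ?)");
InputStream inputStream = new ByteArrayInputStream(testArray);
ps.setInt(1, 100);
// The driver reads from the stream instead of requiring a BLOB locator
ps.setBinaryStream(2, inputStream, testArray.length);
ps.executeUpdate();
ps.close();
inputStream.close();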
Source: (StackOverflow)
I've got some byte[] fields in my entities, e.g.:
@Entity
public class ServicePicture implements Serializable {
private static final long serialVersionUID = 2877629751219730559L;
// seam-gen attributes (you should probably edit these)
@Id
@GeneratedValue
private Long id;
private String description;
@Lob
@Basic(fetch = FetchType.LAZY)
private byte[] picture;
In my database schema the field is set to BLOB, so this should be fine. Anyway: every time I try to insert a picture or PDF - nothing bigger than 1MB - I only receive this:
16:52:27,327 WARN [JDBCExceptionReporter] SQL Error: 0, SQLState: 22001
16:52:27,327 ERROR [JDBCExceptionReporter] Data truncation: Data too long for column 'picture' at row 1
16:52:27,328 ERROR [STDERR] javax.persistence.PersistenceException: org.hibernate.exception.DataException: could not insert: [de.ac.dmg.productfinder.entity.ServicePicture]
16:52:27,328 ERROR [STDERR] at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:629)
16:52:27,328 ERROR [STDERR] at org.hibernate.ejb.AbstractEntityManagerImpl.persist(AbstractEntityManagerImpl.java:218)
16:52:27,328 ERROR [STDERR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
16:52:27,328 ERROR [STDERR] at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
16:52:27,328 ERROR [STDERR] at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
16:52:27,328 ERROR [STDERR] at java.lang.reflect.Method.invoke(Unknown Source)
16:52:27,328 ERROR [STDERR] at org.jboss.seam.persistence.EntityManagerInvocationHandler.invoke(EntityManagerInvocationHandler.java:46)
16:52:27,328 ERROR [STDERR] at $Proxy142.persist(Unknown Source)
I've checked my MySQL cnf and the max_allowed param is set to 16M - am I missing something?
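One thing to rule out (an assumption based on MySQL's documented type sizes, not something visible in the stack trace): a plain MySQL BLOB column holds at most 64KB, and "Data too long for column" is exactly the error that exceeding the column size produces; MEDIUMBLOB (16MB) or LONGBLOB (4GB) is needed for bigger files. A minimal sketch forcing this from the mapping:
@Lob
@Basic(fetch = FetchType.LAZY)
// LONGBLOB so the column can hold files well beyond the 64KB cap of plain BLOB
@Column(columnDefinition = "LONGBLOB")
private byte[] picture;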
Source: (StackOverflow)
I have been trying to display a PDF file which I am getting as a blob from an $http.post response. The PDF must be displayed within the app, using <embed src> for example.
I came across a couple of Stack Overflow posts, but somehow my example doesn't seem to work.
JS:
According to this doc, I went on and tried:
$http.post('/postUrlHere',{myParams}).success(function (response) {
var file = new Blob([response], {type: 'application/pdf'});
var fileURL = URL.createObjectURL(file);
$scope.content = fileURL;
});
Now, from what I understand, fileURL creates a temporary URL that the blob can use as a reference.
HTML:
<embed src="{{content}}" width="200" height="200"></embed>
I am not sure how to handle this in Angular; the ideal situation would be to (1) assign it to a scope, (2) 'prepare/rebuild' the blob into a PDF, (3) pass it to the HTML using <embed>, because I want to display it within the app.
I have been researching for more than a day now, but somehow I can't seem to understand how this works in Angular... and let's just assume the PDF viewer libraries out there aren't an option.
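Two things commonly matter in this setup (a sketch under assumptions, not a confirmed fix: the response has to be requested as binary instead of being decoded as text, and AngularJS refuses blob: URLs in src bindings unless they are marked trusted through the $sce service, which would need to be injected into the controller):
$http.post('/postUrlHere', {myParams}, {responseType: 'arraybuffer'})
  .success(function (response) {
    // Build the Blob from the raw bytes, then mark the object URL as trusted
    var file = new Blob([response], {type: 'application/pdf'});
    var fileURL = URL.createObjectURL(file);
    $scope.content = $sce.trustAsResourceUrl(fileURL);
  });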
Source: (StackOverflow)