SSD interview questions
Top SSD frequently asked interview questions
Can anyone say whether Aerospike is as good as they claim it to be? I'm a bit skeptical, since it's a commercial enterprise. As far as I understand, they have just released an open-source version, but the claims on their website could still be exaggerated.
I'm especially interested in how Aerospike compares to MongoDB.
Source: (StackOverflow)
I'm getting ready to release a tool that is only effective with regular hard drives, not SSDs (solid-state drives). In fact, it shouldn't be used with SSDs, because it would cause a lot of reads/writes with no real benefit.
Does anyone know of a way to detect whether a given drive is solid-state?
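One platform-specific heuristic worth knowing (Linux only, so this is an assumption about the target platform, not a general answer) is the kernel's rotational flag in sysfs. A minimal sketch:

```python
# Hedged sketch: on Linux, the kernel exposes a "rotational" flag in sysfs.
# "1" usually means a spinning disk, "0" means non-rotational (SSD/NVMe).
# It is a heuristic only: some virtualized or USB-attached devices report
# the flag incorrectly.
from pathlib import Path
from typing import Optional

def is_rotational(device: str) -> Optional[bool]:
    """Return True for HDD, False for SSD, None if it cannot be determined.
    `device` is a bare block-device name like "sda"."""
    flag = Path("/sys/block") / device / "queue" / "rotational"
    try:
        return flag.read_text().strip() == "1"
    except OSError:
        return None  # flag missing or unreadable: cannot tell
```

On other platforms (Windows, macOS) the equivalent information comes from OS-specific APIs rather than a file, so this sketch covers only the Linux case.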
Source: (StackOverflow)
It's generally recommended that Cassandra use two separate disks: one for the commit log and the other for everything else.
However, in what appears to be a recent update to the configuration guidelines, the following phrase appears:
For SSDs it is recommended that both commit logs and SSTables are on
the same mount point.
Can anyone explain why it's recommended to use only one disk if it's an SSD?
Thanks.
Source: (StackOverflow)
I'm trying to put together a business case for getting every developer in our company an SSD.
The main codebase contains roughly 400,000 lines of code. My theory is that since the code is scattered across maybe 1,500 files, an SSD would make compiles substantially faster, the logic being that many small reads really punish the seek-time bottleneck of a traditional hard drive.
Am I right? Is an SSD worth the money in productivity gains from a shorter edit/compile cycle?
Source: (StackOverflow)
This morning I ran my tests and there are 2 failures, but I haven't changed any code for a few days, and all tests were passing then.
According to Git, there are no changes (except for coverage.data, which is the test output), and gitk shows no other changes either.
How does Git detect code changes? Could this be caused by an SSD failure or error?
What is the best way to figure out what happened?
EDIT: Working in Ruby on Rails with Unit Test framework.
Source: (StackOverflow)
I have a drive path "C:\". Is there any way to find out whether the actual drive is an older HDD or a solid-state drive?
I need to do this using unmanaged code and C++.
Source: (StackOverflow)
The scenario is about 1 billion records. Each record is 1 KB in size and is stored on an SSD.
Which KV store can provide the best random-read performance? It needs to reduce disk access to only one read per query, with all of the data's index stored in memory.
Redis is fast, but it's too expensive to store 1 TB of data in memory.
LevelDB reads the disk several times per query.
The closest thing I found is fatcache, but it's not persistent; it's an SSD-backed memcached.
Any suggestions?
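The layout the question describes (every key's file offset held in RAM, exactly one disk read per lookup) can be sketched in a few lines. The fixed record size, the `OffsetStore` name, and the index-loading scheme below are illustrative assumptions, not a real product:

```python
# Hedged sketch: keep each key's byte offset in an in-memory dict, so a
# lookup costs exactly one disk access (os.pread, POSIX only).
# Assumptions: fixed 1 KiB records (as in the question) and a known key
# order at load time; a real store would rebuild the index from a log
# or file footer.
import os

RECORD_SIZE = 1024  # 1 KiB per record, as in the question

class OffsetStore:
    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.index = {}  # key -> byte offset; lives entirely in RAM

    def load_index(self, keys):
        # Illustrative: assume record i in the file belongs to keys[i].
        for i, key in enumerate(keys):
            self.index[key] = i * RECORD_SIZE

    def get(self, key):
        off = self.index.get(key)
        if off is None:
            return None
        # One positioned read, no seek syscall, no second disk access.
        return os.pread(self.fd, RECORD_SIZE, off)
```

At 1 billion keys the dict itself becomes the memory problem (tens of GB), which is why production stores in this space use compact hash tables or offset arrays rather than a Python dict; the sketch only shows the access pattern.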
Source: (StackOverflow)
Basically, I installed Microsoft Visual Studio and Team Foundation Server with the default settings. It's blazing fast, which probably means it is reading from and writing to my SSD. I'd actually like to make it work from my standard hard drive to avoid burning through my SSD too quickly. How would one go about this?
I should note that my SSD is my C: drive and my standard HDD is my H: (data) drive. TFS/SqlServer/VS2010 were all installed in the standard Program Files location (which reside on my SSD).
Source: (StackOverflow)
I have an SSD which, per its specification, should deliver no less than 10k IOPS. My benchmark confirms that it can give me 20k IOPS.
Then I create this test:
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.Random;

public class NioTest {
    private static final int sector = 4 * 1024;
    private static byte[] buf = new byte[sector];
    private static int duration = 10; // seconds to run
    private static long[] timings = new long[50000];

    public static final void main(String[] args) throws IOException {
        String filename = args[0];
        long size = Long.parseLong(args[1]);
        RandomAccessFile raf = new RandomAccessFile(filename, "r");
        Random rnd = new Random();
        long start = System.currentTimeMillis();
        int ios = 0;
        while (System.currentTimeMillis() - start < duration * 1000) {
            long t1 = System.currentTimeMillis();
            long pos = (long) (rnd.nextDouble() * (size >> 12));
            raf.seek(pos << 12);
            int count = raf.read(buf);
            timings[ios] = System.currentTimeMillis() - t1;
            ++ios;
        }
        System.out.println("Measured IOPS: " + ios / duration);
        int totalBytes = ios * sector;
        double totalSeconds = (System.currentTimeMillis() - start) / 1000.0;
        double speed = totalBytes / totalSeconds / 1024 / 1024;
        System.out.println(totalBytes + " bytes transferred in " + totalSeconds
                + " secs (" + speed + " MiB/sec)");
        raf.close();
        Arrays.sort(timings);
        int l = timings.length;
        System.out.println("The longest IO = " + timings[l - 1]);
        System.out.println("Median duration = " + timings[l - (ios / 2)]);
        System.out.println("75% duration = " + timings[l - (ios * 3 / 4)]);
        System.out.println("90% duration = " + timings[l - (ios * 9 / 10)]);
        System.out.println("95% duration = " + timings[l - (ios * 19 / 20)]);
        System.out.println("99% duration = " + timings[l - (ios * 99 / 100)]);
    }
}
And then I run this example and get just 2186 IOPS:
$ sudo java -cp ./classes NioTest /dev/disk0 240057409536
Measured IOPS: 2186
89550848 bytes transferred in 10.0 secs (8.540234375 MiB/sec)
The longest IO = 35
Median duration = 0
75% duration = 0
90% duration = 0
95% duration = 0
99% duration = 0
Why does it run so much more slowly than the same test in C?
Update: here is the Python code, which gives 20k IOPS:
import random
import time

def iops(dev, blocksize=4096, t=10):
    # mediasize() is a helper defined elsewhere in the original script
    fh = open(dev, 'r')
    count = 0
    start = time.time()
    while time.time() < start + t:
        count += 1
        pos = random.randint(0, mediasize(dev) - blocksize)  # need at least one block left
        pos &= ~(blocksize - 1)  # sector alignment at blocksize
        fh.seek(pos)
        blockdata = fh.read(blocksize)
    end = time.time()
    t = end - start
    fh.close()
Update 2: NIO code (just a fragment; I won't duplicate the whole method):
...
RandomAccessFile raf = new RandomAccessFile(filename, "r");
InputStream in = Channels.newInputStream(raf.getChannel());
...
int count = in.read(buf);
...
Source: (StackOverflow)
I'm testing SSDs for use with MySQL and am unable to see any performance benefit. This makes me feel like I must be doing something wrong.
Here is the setup:
- Xeon 5520 2.26 Ghz Quad Core
- 12 GB Ram
- 300GB 15k in RAID 1
- 64GB SSD in RAID 1
For the test I moved the mysql directory onto the SSD.
I imported a table with 3 million rows, then imported the same table with the data and index directories symlinked to the 15k drive.
Loading the data into the tables via a dump from mysqldump, the 15k drives showed a faster insert rate than the SSDs:
- 15k ~= 35,800 inserts/sec
- SSD ~= 27,000 inserts/sec
Then I tested SELECT speed by running SELECT * FROM table INTO OUTFILE '/tmp/table.txt':
- 15k ~= 3,000,000 rows in 4.19 seconds
- SSD ~= 3,000,000 rows in 4.21 seconds
The SELECTS were almost identical and the writes were actually slower on the SSD which does not seem right at all. Any thoughts on what I should look into next?
Extra note: I tuned the SSD with the standard changes: noatime and the noop scheduler.
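One thing worth noting: both the mysqldump import and SELECT ... INTO OUTFILE are largely sequential workloads, where SSDs often show little advantage over a 15k spindle; the SSD's win is random I/O. A hedged sketch of a quick sanity check (the file path and iteration counts are arbitrary, and the OS page cache can mask differences, so it should be run on a file much larger than RAM or after dropping caches):

```python
# Hedged sketch: time sequential vs random 4 KiB reads on the same file.
# An SSD typically shows little sequential advantage but a large random one;
# if the random pass shows no gain either, the SSD setup itself (RAID
# controller, partition alignment, scheduler) is worth investigating.
# os.pread is POSIX-only.
import os
import random
import time

def bench(path, block=4096, iterations=2000):
    size = os.path.getsize(path)
    blocks = size // block
    fd = os.open(path, os.O_RDONLY)

    t0 = time.perf_counter()
    for i in range(min(iterations, blocks)):          # sequential pass
        os.pread(fd, block, i * block)
    seq = time.perf_counter() - t0

    t0 = time.perf_counter()
    for _ in range(min(iterations, blocks)):          # random pass
        os.pread(fd, block, random.randrange(blocks) * block)
    rnd = time.perf_counter() - t0

    os.close(fd)
    return seq, rnd
```

If the random pass is not dramatically faster on the SSD than on the 15k array, the problem is below MySQL, not in it.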
Source: (StackOverflow)
I have a very strange situation happening with some of my tests regarding parallel I/O. Here is the situation: I have multiple threads, each opening its own file handle to the same file and reading a finite number of bytes from multiple locations in the file (evenly spaced intervals) into an array. All of this is done with Boost threads.
Now, I assume that on an HDD this should be slower due to the random-access seeking, which is why my tests actually target SSDs. It turns out I get almost no speedup when reading the same file from a solid-state disk compared to an HDD. What might the problem be? Does this seem surprising only to me? I am posting my code below so you can see exactly what I am doing:
void readFunctor(std::string pathToFile, size_t filePos, BYTE* buffer, size_t buffPos,
                 size_t dataLn, boost::barrier& barier) {
    FILE* pFile = fopen(pathToFile.c_str(), "rb");
    fseek(pFile, filePos, SEEK_SET);
    fread(buffer + buffPos, sizeof(BYTE), dataLn, pFile); // read into this thread's slice
    fclose(pFile);
    barier.wait();
}

void joinAllThreads(std::vector<boost::shared_ptr<boost::thread> >& threads) {
    for (std::vector<boost::shared_ptr<boost::thread> >::iterator it = threads.begin();
         it != threads.end(); ++it) {
        (*it)->join();
    }
}

void readDataInParallel(BYTE* buffer, std::string pathToFile, size_t lenOfData, size_t numThreads) {
    std::vector<boost::shared_ptr<boost::thread> > threads;
    boost::barrier barier(numThreads);
    size_t dataPerThread = lenOfData / numThreads;
    for (size_t var = 0; var < numThreads; ++var) {
        size_t filePos = var * dataPerThread;
        size_t bufferPos = var * dataPerThread;
        size_t dataLenForCurrentThread = dataPerThread;
        if (var == numThreads - 1) {
            dataLenForCurrentThread += lenOfData % numThreads;
        }
        boost::shared_ptr<boost::thread> thread(
            new boost::thread(readFunctor, pathToFile, filePos, buffer, bufferPos,
                              dataLenForCurrentThread, boost::ref(barier)));
        threads.push_back(thread);
    }
    joinAllThreads(threads);
}
Now, in my main file I pretty much have:
clock_t start_s = clock();
size_t sizeOfData = 2032221073;
// the buffer is malloc'ed, so give shared_ptr a matching deleter (free, not delete)
boost::shared_ptr<BYTE> buffer((BYTE*) malloc(sizeOfData), free);
readDataInParallel(buffer.get(), "/home/zahari/Desktop/kernels_big.dat", sizeOfData, 4);
clock_t stop_s = clock();
printf("%f %f\n", ((double) start_s / CLOCKS_PER_SEC) * 1000, ((double) stop_s / CLOCKS_PER_SEC) * 1000);
Surprisingly, when reading from SSD, I do not get any speedup compared to HDD? Why might that be?
Source: (StackOverflow)
I have a Core i5 laptop with 4 GB of RAM, and I use the Windows 7 32-bit operating system.
I would like to improve my laptop's performance, but unfortunately I can't upgrade Win 7 to 64-bit.
My question is: if I change the HDD to an SSD (SATA III, with read and write speeds over 500 MB/sec), will this change improve Visual Studio 2010's performance?
Does anybody have any experience with this?
I read some articles about it; some people say it would not be a big improvement and that I should increase CPU speed instead, but others say compiling in Visual Studio is twice as fast on an SSD as on an HDD.
Thanks for your answers!
I have a company laptop, so I can't upgrade the RAM: the official operating system is Win 7 32-bit, which cannot use more than 4 GB. (There is 4 GB of RAM in my laptop.)
Source: (StackOverflow)
I'm currently writing a python script that processes very large (> 10GB) files. As loading the whole file into memory is not an option, I'm right now reading and processing it line by line:
for line in f:
....
Once the script is finished it will run fairly often, so I'm starting to think about what impact that sort of reading will have on my disk's lifespan.
Will the script actually read line by line, or is there some kind of OS-powered buffering happening? If not, should I implement some kind of intermediate buffer myself? Is hitting the disk that often actually harmful? I remember reading that BitTorrent could wear out disks quickly precisely because of that kind of piecemeal reading/writing rather than operating on larger chunks of data.
I'm using both a HDD and an SSD in my test environment, so answers would be interesting for both systems.
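On the buffering question: CPython's file iteration is already buffered by the io layer, so the disk is not touched once per line; lines are served from an in-memory chunk read in units of io.DEFAULT_BUFFER_SIZE. A minimal sketch showing how that chunk size can be raised explicitly (the 1 MiB figure is an arbitrary example, not a recommendation):

```python
# Hedged sketch: line iteration does NOT hit the disk once per line.
# Python's io stack requests the file in buffered chunks and serves
# individual lines from RAM; the buffering= argument controls how large
# a chunk the OS is asked for per underlying read() call.
import io

def count_lines(path, buffer_bytes=1024 * 1024):
    # buffering= sets the size of the internal read buffer (here 1 MiB,
    # an arbitrary choice for illustration; the default is
    # io.DEFAULT_BUFFER_SIZE, typically a small multiple of the block size)
    with open(path, "r", buffering=buffer_bytes) as f:
        return sum(1 for _ in f)
```

On top of this, the OS page cache batches and caches reads again, so sequential line-by-line reading of a large file is about as gentle on the hardware as reading it in big chunks by hand.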
Source: (StackOverflow)
What I Want
I want to simulate the performance of a normal hard drive on my SSD based development machine.
Background
I'm developing a Mac application on a Macbook with an SSD. It's gloriously fast.
If someone has a standard platter hard drive, my app will be slower for them. My app is heavy on Core Data too, so the disk access speed will be a significant factor.
I worry that while the performance measurements I take with Instruments look fine, when a customer runs my app on a normal hard drive it will be achingly slow.
What I've Tried
Before I installed my SSD, I measured the performance of my app in Instruments. After the install, I measured it again and the two benchmarks were identical.
This doesn't make sense to me; I'm convinced I was doing something wrong. Instruments probably measures CPU time, not wall-clock time. But still, surely the speed of the hard drive should affect the benchmark? Or does Instruments somehow compensate for it?
Source: (StackOverflow)
Dear fellow developers: for some reason, updating 1720 records takes around 15 seconds on an SSD (especially with TRIM enabled).
I have tweaked the SQLite settings using the following document (which works well):
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
I have the following PRAGMAs set to optimize performance, and I DO wrap the complete set of updates in a transaction.
sqlite3_exec(database, "PRAGMA cache_size=500000;", nil, nil, nil);
sqlite3_exec(database, "PRAGMA synchronous=OFF", nil, nil, nil);
sqlite3_exec(database, "PRAGMA count_changes=OFF", nil, nil, nil);
sqlite3_exec(database, "PRAGMA temp_store=MEMORY", nil, nil, nil);
It seems the SSD is doing too much work (like erasing blocks), which makes it block for 15 seconds for an update of just 1720 simple records.
Weirdly enough: inserting 2500 records is almost instant.
Can you help me and give me some pointers on how to fix this?
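One hedged hypothesis, not a diagnosis: if the UPDATE's WHERE column has no index, each UPDATE statement scans the whole table, which makes updates crawl while inserts stay instant — exactly the asymmetry described above. A small sqlite3 experiment (the table shape and row counts are made up for illustration):

```python
# Hedged experiment: time a batch of UPDATEs with and without an index on
# the WHERE column, inside a single transaction as in the question.
# Without the index, every UPDATE is a full table scan; with it, each is
# a cheap lookup. Table name, sizes, and column names are illustrative.
import sqlite3
import time

ROWS = 5000        # rows in the table (arbitrary)
UPDATES = 1720     # matches the record count in the question

def timed_updates(indexed: bool) -> float:
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, val TEXT)")  # deliberately no PK
    con.executemany("INSERT INTO t VALUES (?, ?)",
                    ((i, "x") for i in range(ROWS)))
    if indexed:
        con.execute("CREATE INDEX t_id ON t(id)")
    t0 = time.perf_counter()
    with con:  # one transaction around all updates
        con.executemany("UPDATE t SET val = ? WHERE id = ?",
                        (("y", i) for i in range(UPDATES)))
    return time.perf_counter() - t0
```

If the indexed run is orders of magnitude faster, the fix is a CREATE INDEX on the updated table's key column rather than more PRAGMA tuning.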
Source: (StackOverflow)