SSD interview questions
Top SSD frequently asked interview questions
Can anyone say whether Aerospike is as good as they claim it to be? I'm a bit skeptical, since it's a commercial enterprise. As far as I understand, they have just released an open-source version, but the claims on their website could still be exaggerated.
I'm especially interested in how Aerospike compares to MongoDB.
Source: (StackOverflow)
I'm getting ready to release a tool that is only effective with regular hard drives, not SSDs (solid-state drives). In fact, it shouldn't be used with SSDs, because it would cause a lot of reads/writes with no real benefit.
Does anyone know of a way to detect whether a given drive is solid-state?
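One platform-specific heuristic worth knowing (Linux only, so this is an assumption about the target platform, not a general answer) is the kernel's rotational flag in sysfs. A minimal sketch:

```python
# Hedged sketch: on Linux, the kernel exposes a "rotational" flag in sysfs.
# "1" usually means a spinning disk, "0" means non-rotational (SSD/NVMe).
# It is a heuristic only: some virtualized or USB-attached devices report
# the flag incorrectly.
from pathlib import Path
from typing import Optional

def is_rotational(device: str) -> Optional[bool]:
    """Return True for HDD, False for SSD, None if it cannot be determined.
    `device` is a bare block-device name like "sda"."""
    flag = Path("/sys/block") / device / "queue" / "rotational"
    try:
        return flag.read_text().strip() == "1"
    except OSError:
        return None  # flag missing or unreadable: cannot tell
```

On other platforms (Windows, macOS) the equivalent information comes from OS-specific APIs rather than a file, so this sketch covers only the Linux case.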
Source: (StackOverflow)
It's generally recommended that Cassandra use two separate disks: one for the commit log and the other for everything else.
However, in what appears to be a recent update to the configuration guidelines, the following phrase appears:
For SSDs it is recommended that both commit logs and SSTables are on
the same mount point.
Can anyone explain why it's recommended to use only one disk if it's an SSD?
Thanks.
Source: (StackOverflow)
I'm trying to put together a business case for getting every developer in our company an SSD.
The main codebase contains roughly 400,000 lines of code. My theory is that since the code is scattered across maybe 1,500 files, an SSD would make compiles substantially faster, the logic being that many small reads really punish the seek-time bottleneck of a traditional hard drive.
Am I right? Is an SSD worth the money in productivity gains from a shorter edit/compile cycle?
Source: (StackOverflow)
This morning I ran my tests and there are 2 failures, but I haven't changed any code for a few days, and all tests were passing then.
According to Git, there are no changes (except for coverage.data, which is the test output), and gitk shows no other changes either.
How does Git detect code changes? Could this be caused by an SSD failure or error?
What is the best way to figure out what happened?
EDIT: Working in Ruby on Rails with Unit Test framework.
Source: (StackOverflow)
I have a drive path "C:\". Is there any way to find out whether the actual drive is an older HDD or a solid-state drive?
I need to do this using unmanaged code and C++.
Source: (StackOverflow)
The scenario is about 1 billion records. Each record is 1 KB in size and is stored on an SSD.
Which KV store can provide the best random-read performance? It needs to reduce disk access to only one read per query, with all of the data's index stored in memory.
Redis is fast, but it's too expensive to store 1 TB of data in memory.
LevelDB reads the disk several times per query.
The closest thing I found is fatcache, but it's not persistent; it's an SSD-backed memcached.
Any suggestions?
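The layout the question describes (every key's file offset held in RAM, exactly one disk read per lookup) can be sketched in a few lines. The fixed record size, the `OffsetStore` name, and the index-loading scheme below are illustrative assumptions, not a real product:

```python
# Hedged sketch: keep each key's byte offset in an in-memory dict, so a
# lookup costs exactly one disk access (os.pread, POSIX only).
# Assumptions: fixed 1 KiB records (as in the question) and a known key
# order at load time; a real store would rebuild the index from a log
# or file footer.
import os

RECORD_SIZE = 1024  # 1 KiB per record, as in the question

class OffsetStore:
    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.index = {}  # key -> byte offset; lives entirely in RAM

    def load_index(self, keys):
        # Illustrative: assume record i in the file belongs to keys[i].
        for i, key in enumerate(keys):
            self.index[key] = i * RECORD_SIZE

    def get(self, key):
        off = self.index.get(key)
        if off is None:
            return None
        # One positioned read, no seek syscall, no second disk access.
        return os.pread(self.fd, RECORD_SIZE, off)
```

At 1 billion keys the dict itself becomes the memory problem (tens of GB), which is why production stores in this space use compact hash tables or offset arrays rather than a Python dict; the sketch only shows the access pattern.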
Source: (StackOverflow)
Basically, I installed Microsoft Visual Studio and Team Foundation Server with the default settings. It's blazing fast, which probably means it is reading from and writing to my SSD. I'd actually like to make it work from my standard hard drive to avoid burning through my SSD too quickly. How would one go about this?
I should note that my SSD is my C: drive and my standard HDD is my H: (data) drive. TFS/SqlServer/VS2010 were all installed in the standard Program Files location (which reside on my SSD).
Source: (StackOverflow)
I have an SSD which, per its specification, should deliver no less than 10k IOPS. My benchmark confirms that it can give me 20k IOPS.
Then I create this test:
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.Random;

public class NioTest {
    private static final int sector = 4 * 1024;
    private static byte[] buf = new byte[sector];
    private static int duration = 10; // seconds to run
    private static long[] timings = new long[50000];

    public static final void main(String[] args) throws IOException {
        String filename = args[0];
        long size = Long.parseLong(args[1]);
        RandomAccessFile raf = new RandomAccessFile(filename, "r");
        Random rnd = new Random();
        long start = System.currentTimeMillis();
        int ios = 0;
        while (System.currentTimeMillis() - start < duration * 1000) {
            long t1 = System.currentTimeMillis();
            long pos = (long) (rnd.nextDouble() * (size >> 12));
            raf.seek(pos << 12);
            int count = raf.read(buf);
            timings[ios] = System.currentTimeMillis() - t1;
            ++ios;
        }
        System.out.println("Measured IOPS: " + ios / duration);
        int totalBytes = ios * sector;
        double totalSeconds = (System.currentTimeMillis() - start) / 1000.0;
        double speed = totalBytes / totalSeconds / 1024 / 1024;
        System.out.println(totalBytes + " bytes transferred in " + totalSeconds
                + " secs (" + speed + " MiB/sec)");
        raf.close();
        Arrays.sort(timings);
        int l = timings.length;
        System.out.println("The longest IO = " + timings[l - 1]);
        System.out.println("Median duration = " + timings[l - (ios / 2)]);
        System.out.println("75% duration = " + timings[l - (ios * 3 / 4)]);
        System.out.println("90% duration = " + timings[l - (ios * 9 / 10)]);
        System.out.println("95% duration = " + timings[l - (ios * 19 / 20)]);
        System.out.println("99% duration = " + timings[l - (ios * 99 / 100)]);
    }
}
And then I run this example and get just 2186 IOPS:
$ sudo java -cp ./classes NioTest /dev/disk0 240057409536
Measured IOPS: 2186
89550848 bytes transferred in 10.0 secs (8.540234375 MiB/sec)
The longest IO = 35
Median duration = 0
75% duration = 0
90% duration = 0
95% duration = 0
99% duration = 0
Why does it run so much more slowly than the same test in C?
Update: here is the Python code, which gives 20k IOPS:
import random
import time

def iops(dev, blocksize=4096, t=10):
    # mediasize() is a helper defined elsewhere in the original script
    fh = open(dev, 'r')
    count = 0
    start = time.time()
    while time.time() < start + t:
        count += 1
        pos = random.randint(0, mediasize(dev) - blocksize)  # need at least one block left
        pos &= ~(blocksize - 1)  # sector alignment at blocksize
        fh.seek(pos)
        blockdata = fh.read(blocksize)
    end = time.time()
    t = end - start
    fh.close()
Update 2: NIO code (just a fragment; I won't duplicate the whole method):
...
RandomAccessFile raf = new RandomAccessFile(filename, "r");
InputStream in = Channels.newInputStream(raf.getChannel());
...
int count = in.read(buf);
...
Source: (StackOverflow)
I'm testing SSDs for use with MySQL and am unable to see any performance benefit. This makes me feel like I must be doing something wrong.
Here is the setup:
- Xeon 5520 2.26 Ghz Quad Core
- 12 GB Ram
- 300GB 15k in RAID 1
- 64GB SSD in RAID 1
For the test I moved the mysql directory onto the SSD.
I imported a table with 3 million rows, then imported the same table with the data and index directories symlinked to the 15k drive.
Loading the data into the tables via a dump from mysqldump, the 15k drives showed a faster insert rate than the SSDs:
- 15k ~= 35,800 inserts/sec
- SSD ~= 27,000 inserts/sec
Then I tested SELECT speed by running SELECT * FROM table INTO OUTFILE '/tmp/table.txt':
- 15k ~= 3,000,000 rows in 4.19 seconds
- SSD ~= 3,000,000 rows in 4.21 seconds
The SELECTS were almost identical and the writes were actually slower on the SSD which does not seem right at all. Any thoughts on what I should look into next?
Extra note: I tuned the SSD with the standard changes: noatime and the noop scheduler.
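One thing worth noting: both the mysqldump import and SELECT ... INTO OUTFILE are largely sequential workloads, where SSDs often show little advantage over a 15k spindle; the SSD's win is random I/O. A hedged sketch of a quick sanity check (the file path and iteration counts are arbitrary, and the OS page cache can mask differences, so it should be run on a file much larger than RAM or after dropping caches):

```python
# Hedged sketch: time sequential vs random 4 KiB reads on the same file.
# An SSD typically shows little sequential advantage but a large random one;
# if the random pass shows no gain either, the SSD setup itself (RAID
# controller, partition alignment, scheduler) is worth investigating.
# os.pread is POSIX-only.
import os
import random
import time

def bench(path, block=4096, iterations=2000):
    size = os.path.getsize(path)
    blocks = size // block
    fd = os.open(path, os.O_RDONLY)

    t0 = time.perf_counter()
    for i in range(min(iterations, blocks)):          # sequential pass
        os.pread(fd, block, i * block)
    seq = time.perf_counter() - t0

    t0 = time.perf_counter()
    for _ in range(min(iterations, blocks)):          # random pass
        os.pread(fd, block, random.randrange(blocks) * block)
    rnd = time.perf_counter() - t0

    os.close(fd)
    return seq, rnd
```

If the random pass is not dramatically faster on the SSD than on the 15k array, the problem is below MySQL, not in it.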
Source: (StackOverflow)
I have a very strange situation happening with some of my tests regarding parallel I/O. Here is the situation: I have multiple threads, each opening its own file handle to the same file and reading a finite number of bytes from multiple locations in the file (evenly spaced intervals) into an array. All of this is done with Boost threads.
Now, I assume that on an HDD this should be slower due to the random-access seeking, which is why my tests actually target SSDs. It turns out I get almost no speedup when reading the same file from a solid-state disk compared to an HDD. What might the problem be? Does this seem surprising only to me? I am posting my code below so you can see exactly what I am doing:
void readFunctor(std::string pathToFile, size_t filePos, BYTE* buffer, size_t buffPos,
                 size_t dataLn, boost::barrier& barier) {
    FILE* pFile = fopen(pathToFile.c_str(), "rb");
    fseek(pFile, filePos, SEEK_SET);
    fread(buffer + buffPos, sizeof(BYTE), dataLn, pFile); // read into this thread's slice
    fclose(pFile);
    barier.wait();
}

void joinAllThreads(std::vector<boost::shared_ptr<boost::thread> >& threads) {
    for (std::vector<boost::shared_ptr<boost::thread> >::iterator it = threads.begin();
         it != threads.end(); ++it) {
        (*it)->join();
    }
}

void readDataInParallel(BYTE* buffer, std::string pathToFile, size_t lenOfData, size_t numThreads) {
    std::vector<boost::shared_ptr<boost::thread> > threads;
    boost::barrier barier(numThreads);
    size_t dataPerThread = lenOfData / numThreads;
    for (size_t var = 0; var < numThreads; ++var) {
        size_t filePos = var * dataPerThread;
        size_t bufferPos = var * dataPerThread;
        size_t dataLenForCurrentThread = dataPerThread;
        if (var == numThreads - 1) {
            dataLenForCurrentThread += lenOfData % numThreads;
        }
        boost::shared_ptr<boost::thread> thread(
            new boost::thread(readFunctor, pathToFile, filePos, buffer, bufferPos,
                              dataLenForCurrentThread, boost::ref(barier)));
        threads.push_back(thread);
    }
    joinAllThreads(threads);
}
Now, in my main file I pretty much have:
clock_t start_s = clock();
size_t sizeOfData = 2032221073;
// the buffer is malloc'ed, so give shared_ptr a matching deleter (free, not delete)
boost::shared_ptr<BYTE> buffer((BYTE*) malloc(sizeOfData), free);
readDataInParallel(buffer.get(), "/home/zahari/Desktop/kernels_big.dat", sizeOfData, 4);
clock_t stop_s = clock();
printf("%f %f\n", ((double) start_s / CLOCKS_PER_SEC) * 1000, ((double) stop_s / CLOCKS_PER_SEC) * 1000);
Surprisingly, when reading from SSD, I do not get any speedup compared to HDD? Why might that be?
Source: (StackOverflow)
I have a Core i5 laptop with 4 GB of RAM, and I use the Windows 7 32-bit operating system.
I would like to improve my laptop's performance, but unfortunately I can't upgrade Win 7 to 64-bit.
My question is: if I change the HDD to an SSD (SATA III, with read and write speeds over 500 MB/sec), will this change improve Visual Studio 2010's performance?
Does anybody have any experience with this?
I read some articles about it; some people say it would not be a big improvement and that I should increase CPU speed instead, but others say compiling in Visual Studio is twice as fast on an SSD as on an HDD.
Thanks for your answers!
I have a company laptop, so I can't upgrade the RAM: the official operating system is Win 7 32-bit, which cannot use more than 4 GB. (There is 4 GB of RAM in my laptop.)
Source: (StackOverflow)
I'm currently writing a python script that processes very large (> 10GB) files. As loading the whole file into memory is not an option, I'm right now reading and processing it line by line:
for line in f:
....
Once the script is finished it will run fairly often, so I'm starting to think about what impact that sort of reading will have on my disk's lifespan.
Will the script actually read line by line, or is there some kind of OS-powered buffering happening? If not, should I implement some kind of intermediate buffer myself? Is hitting the disk that often actually harmful? I remember reading that BitTorrent could wear out disks quickly precisely because of that kind of piecemeal reading/writing rather than operating on larger chunks of data.
I'm using both a HDD and an SSD in my test environment, so answers would be interesting for both systems.
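On the buffering question: CPython's file iteration is already buffered by the io layer, so the disk is not touched once per line; lines are served from an in-memory chunk read in units of io.DEFAULT_BUFFER_SIZE. A minimal sketch showing how that chunk size can be raised explicitly (the 1 MiB figure is an arbitrary example, not a recommendation):

```python
# Hedged sketch: line iteration does NOT hit the disk once per line.
# Python's io stack requests the file in buffered chunks and serves
# individual lines from RAM; the buffering= argument controls how large
# a chunk the OS is asked for per underlying read() call.
import io

def count_lines(path, buffer_bytes=1024 * 1024):
    # buffering= sets the size of the internal read buffer (here 1 MiB,
    # an arbitrary choice for illustration; the default is
    # io.DEFAULT_BUFFER_SIZE, typically a small multiple of the block size)
    with open(path, "r", buffering=buffer_bytes) as f:
        return sum(1 for _ in f)
```

On top of this, the OS page cache batches and caches reads again, so sequential line-by-line reading of a large file is about as gentle on the hardware as reading it in big chunks by hand.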
Source: (StackOverflow)
What I Want
I want to simulate the performance of a normal hard drive on my SSD based development machine.
Background
I'm developing a Mac application on a Macbook with an SSD. It's gloriously fast.
If someone has a standard platter hard drive, my app will be slower for them. My app is heavy on Core Data too, so the disk access speed will be a significant factor.
I worry that while the performance measurements I take with Instruments look fine, when a customer runs my app on a normal hard drive it will be achingly slow.
What I've Tried
Before I installed my SSD, I measured the performance of my app in Instruments. After the install, I measured it again and the two benchmarks were identical.
This doesn't make sense to me; I'm convinced I was doing something wrong. Instruments probably measures CPU time, not wall-clock time. But still, surely the speed of the hard drive should affect the benchmark? Or does Instruments somehow compensate for it?
Source: (StackOverflow)
Dear fellow developers: for some reason, updating 1720 records takes around 15 seconds on an SSD (especially with TRIM enabled).
I have tweaked the SQLite settings using the following document (which works well):
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
I have the following PRAGMAs set to optimize performance, and I DO wrap the complete set of updates in a transaction.
sqlite3_exec(database, "PRAGMA cache_size=500000;", nil, nil, nil);
sqlite3_exec(database, "PRAGMA synchronous=OFF", nil, nil, nil);
sqlite3_exec(database, "PRAGMA count_changes=OFF", nil, nil, nil);
sqlite3_exec(database, "PRAGMA temp_store=MEMORY", nil, nil, nil);
It seems the SSD is doing too much work (like erasing blocks), which makes it block for 15 seconds for an update of just 1720 simple records.
Weirdly enough: inserting 2500 records is almost instant.
Can you help me and give me some pointers on how to fix this?
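One hedged hypothesis, not a diagnosis: if the UPDATE's WHERE column has no index, each UPDATE statement scans the whole table, which makes updates crawl while inserts stay instant — exactly the asymmetry described above. A small sqlite3 experiment (the table shape and row counts are made up for illustration):

```python
# Hedged experiment: time a batch of UPDATEs with and without an index on
# the WHERE column, inside a single transaction as in the question.
# Without the index, every UPDATE is a full table scan; with it, each is
# a cheap lookup. Table name, sizes, and column names are illustrative.
import sqlite3
import time

ROWS = 5000        # rows in the table (arbitrary)
UPDATES = 1720     # matches the record count in the question

def timed_updates(indexed: bool) -> float:
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, val TEXT)")  # deliberately no PK
    con.executemany("INSERT INTO t VALUES (?, ?)",
                    ((i, "x") for i in range(ROWS)))
    if indexed:
        con.execute("CREATE INDEX t_id ON t(id)")
    t0 = time.perf_counter()
    with con:  # one transaction around all updates
        con.executemany("UPDATE t SET val = ? WHERE id = ?",
                        (("y", i) for i in range(UPDATES)))
    return time.perf_counter() - t0
```

If the indexed run is orders of magnitude faster, the fix is a CREATE INDEX on the updated table's key column rather than more PRAGMA tuning.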
Source: (StackOverflow)