data-structures interview questions
Top data-structures frequently asked interview questions
I'm a beginner in C programming, and I was wondering what the difference is between using typedef when defining a structure and not using typedef. It seems to me that there's really no difference; they accomplish the same thing.
struct myStruct {
    int one;
    int two;
};
vs.
typedef struct {
    int one;
    int two;
} myStruct;
Source: (StackOverflow)
I have a data structure that represents a directed graph, and I want to render that dynamically on an HTML page. These graphs will usually be just a few nodes, maybe ten at the very upper end, so my guess is that performance isn't going to be a big deal. Ideally, I'd like to be able to hook it in with jQuery so that users can tweak the layout manually by dragging the nodes around.
Note: I'm not looking for a charting library.
Source: (StackOverflow)
A long time ago, I bought a data structures book off the bargain table for $1.25. In it, the explanation for a hashing function said that it should ultimately mod by a prime number because of "the nature of math".
What do you expect from a $1.25 book?
Anyway, I've had years to think about the nature of math, and still can't figure it out.
Is the distribution of numbers truly more even when there are a prime number of buckets? Or is this an old programmer's tale that everyone accepts because everybody else accepts it?
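One way to probe the question empirically is a minimal Python sketch. It assumes a trivial hash of `key % num_buckets` and a hypothetical key set in which every key is a multiple of 4 (for instance, aligned addresses). The helper name `bucket_counts` and the bucket counts 12 and 13 are illustrative choices, not taken from the book.

```python
from collections import Counter

def bucket_counts(keys, num_buckets):
    """Count how many keys land in each bucket under key % num_buckets."""
    return Counter(key % num_buckets for key in keys)

# Hypothetical key set: 1000 multiples of 4.
keys = [4 * i for i in range(1000)]

composite = bucket_counts(keys, 12)  # 12 shares the factor 4 with the keys
prime = bucket_counts(keys, 13)      # 13 shares no factor with the keys

print("buckets used out of 12:", len(composite))
print("buckets used out of 13:", len(prime))
```

In this sketch the keys only ever reach buckets 0, 4 and 8 when there are 12 buckets, while all 13 buckets get used with the prime count. The effect depends on the keys sharing a factor with the bucket count; with well-mixed hash values the choice of modulus matters much less.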
Source: (StackOverflow)
Brief background: Many (most?) contemporary programming languages in widespread use have at least a handful of ADTs [abstract data types] in common, in particular,
string (a sequence of characters),
list (an ordered collection of values), and
map-based type (an unordered array that maps keys to values)
In the R programming language, the first two are implemented as character and vector, respectively.
When I began learning R, two things were obvious almost from the start: first, List is the most important data type in R (because it is the parent class for the R Data Frame), and second, I just couldn't understand how Lists worked, at least not well enough to use them correctly in my code.
For one thing, it seemed to me that R's List data type was a straightforward implementation of the map ADT (dictionary in Python, NSMutableDictionary in Objective-C, hash in Perl and Ruby, object literal in JavaScript, and so forth).
For instance, you create them just like you would a Python dictionary, by passing key-value pairs to a constructor (which in Python is dict not list):
>>> x = list("ev1"=10, "ev2"=15, "rv"="Group 1")
And you access the items of an R List just like you would those of a Python dictionary, e.g., x['ev1']. Likewise, you can retrieve just the 'keys' or just the 'values' by:
>>> names(x) # fetch just the 'keys' of an R list
"ev1" "ev2" "rv"
>>> unlist(x) # fetch just the 'values' of an R list
"10" "15" "Group 1"
>>> x = list("a"=6, "b"=9, "c"=3)
>>> sum(unlist(x))
18
But R Lists are also unlike other map-type ADTs (from among the languages I've learned, anyway). My guess is that this is a consequence of the initial spec for S, i.e., an intention to design a data/statistics DSL [domain-specific language] from the ground up.
There are three significant differences between R Lists and mapping types in other languages in widespread use (e.g., Python, Perl, JavaScript):
First, Lists in R are an ordered collection, just like vectors, even though the values are keyed (i.e., the keys can be any hashable value, not just sequential integers). Nearly always, the mapping data type in other languages is unordered.
Second, Lists can be returned from functions even though you never passed in a List when you called the function, and even though the function that returned the List doesn't contain an (explicit) List constructor (of course, you can deal with this in practice by wrapping the returned result in a call to unlist):
>>> x = strsplit(LETTERS[1:10], "") # passing in an object of type 'character'
>>> class(x) # returns 'list', not a vector of length 2
list
A third peculiar feature of R's Lists: it doesn't seem that they can be members of another ADT, and if you try to do that then the primary container is coerced to a list. E.g.,
>>> x = c(0.5, 0.8, 0.23, list(0.5, 0.2, 0.9), recursive=T)
>>> class(x)
list
My intention here is not to criticize the language or how it is documented; likewise, I'm not suggesting there is anything wrong with the List data structure or how it behaves. All I'm after is to correct my understanding of how Lists work so I can use them correctly in my code.
Here are the sorts of things I'd like to better understand:
What are the rules which determine when a function call will return a List (e.g., strsplit expression recited above)?
If I don't explicitly assign names to a list (e.g., list(10,20,30,40)), are the default names just sequential integers beginning with 1? (I assume, but I am far from certain, that the answer is yes; otherwise we wouldn't be able to coerce this type of List to a vector with a call to unlist.)
Why do these two different operators, [] and [[]], return the same result?
x = list(1, 2, 3, 4)
Both expressions return "1":
x[1]
x[[1]]
Why do these two expressions not return the same result?
x = list(1, 2, 3, 4)
x2 = list(1:4)
Please don't point me to the R documentation (?list, R-intro); I have read it carefully and it does not help me answer the type of questions I recited just above.
(Lastly, I recently learned of and began using an R package (available on CRAN) called hash, which implements conventional map-type behavior via an S4 class; I can certainly recommend this package.)
Source: (StackOverflow)
I use LINQ to Objects instructions on an ordered array.
Which operations shouldn't I do to be sure the order of the array is not changed?
Source: (StackOverflow)
In a B tree you can store both keys and data in the internal and leaf nodes, but in a B+ tree you have to store the data in the leaf nodes only.
Is there any advantage to doing this in a B+ tree?
Why not use B trees instead of B+ trees everywhere, since intuitively they seem much faster? I mean, why do you need to replicate the key (data) in a B+ tree?
Source: (StackOverflow)
Does anyone know how the built-in dictionary type for Python is implemented? My understanding is that it is some sort of hash table, but I haven't been able to find any sort of definitive answer.
Source: (StackOverflow)
I'm trying to answer two questions in a definitive list:
- What are the underlying data structures used for Redis?
- And what are the main advantages/disadvantages/use cases for each type?
So, I've read that Redis lists are actually implemented with linked lists. But for other types, I'm not able to dig up any information. Also, if someone stumbled upon this question without a high-level summary of the pros and cons of modifying or accessing different data structures, they'd have a complete list to reference of when best to use specific types.
Specifically, I'm looking to outline all types: string, list, set, zset and hash.
Oh, I've looked at these articles, among others, so far:
Source: (StackOverflow)
Say you have a linked list structure in Java. It's made up of Nodes:
class Node {
    Node next;
    // some user data
}
and each Node points to the next node, except for the last Node, which has null for next. Say there is a possibility that the list can contain a loop - i.e. the final Node, instead of having a null, has a reference to one of the nodes in the list which came before it.
What's the best way of writing boolean hasLoop(Node first), which would return true if the given Node is the first of a list with a loop, and false otherwise? How could you write it so that it takes a constant amount of space and a reasonable amount of time?
[Picture: a list with a loop]
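One commonly cited technique that fits the constant-space requirement is Floyd's "tortoise and hare" cycle detection. What follows is a minimal sketch in Python rather than Java; the `Node` class here is a hypothetical stand-in for the Java class in the question, and `has_loop` mirrors the requested `hasLoop` signature. It is one possible approach, not necessarily the best one.

```python
class Node:
    """Hypothetical stand-in for the Java Node class in the question."""

    def __init__(self, data=None):
        self.data = data
        self.next = None


def has_loop(first):
    """Return True if the list starting at `first` contains a loop.

    Floyd's cycle detection: `slow` advances one node per step and `fast`
    advances two. If there is a loop, `fast` eventually lands on `slow`;
    if there is not, `fast` runs off the end of the list.
    """
    slow = fast = first
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


# Usage: a -> b -> c, then close the loop by pointing c back at b.
a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.next = b, c
print(has_loop(a))  # False: the list still ends in None
c.next = b
print(has_loop(a))  # True: c now points back to b
```

Only two references are kept regardless of list length, so the extra space is constant, and the running time is linear in the number of nodes.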
Source: (StackOverflow)
I believe this is another easy one for you LINQ masters out there.
Is there any way I can split a List into several separate lists of SomeObject, using the item index as the delimiter of each split?
Let me exemplify:
I have a List<SomeObject> and I need a List<List<SomeObject>> or List<SomeObject>[], so that each of these resulting lists will contain a group of 3 items of the original list (sequentially).
E.g.:
Original List: [a, g, e, w, p, s, q, f, x, y, i, m, c]
Resulting lists: [a, g, e], [w, p, s], [q, f, x], [y, i, m], [c]
I'd also need the size of the resulting lists to be a parameter of this function.
Is it possible??
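To make the desired behavior concrete, here is a minimal sketch in Python rather than LINQ. The function name `split_into_chunks` is hypothetical; it only illustrates the grouping described above, with the chunk size passed as a parameter.

```python
def split_into_chunks(items, size):
    """Split `items` into consecutive sublists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the list in strides of `size` and slice out each group.
    return [items[i:i + size] for i in range(0, len(items), size)]


original = list("agewpsqfxyimc")
for chunk in split_into_chunks(original, 3):
    print(chunk)
```

The last group is simply shorter when the list length is not a multiple of the chunk size, which matches the trailing [c] in the example.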
Source: (StackOverflow)
Suppose we have two stacks and no other temporary variable.
Is it possible to "construct" a queue data structure using only the two stacks?
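A minimal sketch of one well-known construction, written in Python and using lists purely as stacks (only `append` and `pop` on the end). The class and attribute names (`TwoStackQueue`, `_inbox`, `_outbox`) are hypothetical.

```python
class TwoStackQueue:
    """FIFO queue built from two LIFO stacks."""

    def __init__(self):
        self._inbox = []   # receives newly enqueued items
        self._outbox = []  # holds items in dequeue order

    def enqueue(self, item):
        self._inbox.append(item)

    def dequeue(self):
        # Refill the outbox only when it is empty; popping everything from
        # the inbox and pushing it onto the outbox reverses the order, so
        # the oldest item ends up on top.
        if not self._outbox:
            while self._inbox:
                self._outbox.append(self._inbox.pop())
        if not self._outbox:
            raise IndexError("dequeue from empty queue")
        return self._outbox.pop()


queue = TwoStackQueue()
for value in (1, 2, 3):
    queue.enqueue(value)
print(queue.dequeue())  # 1: the first item enqueued comes out first
```

Each item is pushed and popped at most twice in total, so the cost per operation is constant when averaged over a sequence of operations.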
Source: (StackOverflow)
I'm sure there's a good reason, but could someone please explain why the java.util.Set interface lacks get(int Index), or any similar get() method?
It seems that sets are great for putting things into, but I can't find an elegant way of retrieving a single item from it.
If I know I want the first item, I can use set.iterator().next(), but otherwise it seems I have to cast to an Array to retrieve an item at a specific index?
What are the appropriate ways of retrieving data from a set? (other than using an iterator)
I'm sure the fact that it's excluded from the API means there's a good reason for not doing this -- could someone please enlighten me?
EDIT:
Some extremely great answers here, and a few saying "more context". The specific scenario was a dbUnit test, where I could reasonably assert that the returned set from a query had only 1 item, and I was trying to access that item.
However, the question is more valid without the scenario, as it remains more focussed:
What's the difference between a set and a list?
Thanks to all for the fantastic answers below.
Source: (StackOverflow)
I was looking for a tree or graph data structure in C# but I guess there isn't one provided. An Extensive Examination of Data Structures Using C# 2.0 explains a bit about why. Is there a convenient library which is commonly used to provide this functionality? Perhaps through a strategy pattern to solve the issues presented in the article.
I feel a bit silly implementing my own tree, just as I would implementing my own ArrayList.
I just want a generic tree which can be unbalanced. Think of a directory tree. C5 looks nifty, but their tree structures seem to be implemented as balanced red-black trees better suited to search than representing a hierarchy of nodes.
Source: (StackOverflow)
As made clear in update 3 on this answer, this notation:
var hash = {};
hash[X]
does not actually hash the object X; it just converts X to a string (via .toString() if it's an object, or some other built-in conversions for various primitive types) and then looks that string up, without hashing it, in "hash". Object equality is also not checked: if two different objects have the same string conversion, they will just overwrite each other.
Given this, are there any efficient implementations of hashmaps in JavaScript? (For example, the 2nd Google result for "javascript hashmap" yields an implementation which is O(n) for any operation. Various other results ignore the fact that different objects with equivalent string representations overwrite each other.)
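The overwrite problem described above can be mimicked in Python as an analogy (this is not JavaScript, and the names `Point` and `put_by_string` are hypothetical). The sketch keys one table by each object's string form and another by the object itself, which Python hashes by identity by default.

```python
class Point:
    """Hypothetical object whose string form ignores identity."""

    def __str__(self):
        return "[object Object]"


def put_by_string(table, key, value):
    """Store `value` under the string conversion of `key`."""
    table[str(key)] = value


a, b = Point(), Point()

# Keyed by string conversion: both objects collapse onto the same key.
table = {}
put_by_string(table, a, "first")
put_by_string(table, b, "second")
print(len(table))

# Keyed by the objects themselves: the two entries stay distinct.
by_identity = {a: "first", b: "second"}
print(len(by_identity))
```

The first table ends up with a single entry because the second write replaces the first, while the second table keeps both objects apart.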
Source: (StackOverflow)