Indexing interview questions
Top frequently asked interview questions about indexing
While studying for the 70-433 exam I noticed you can create a covering index in one of the following two ways.
CREATE INDEX idx1 ON MyTable (Col1, Col2, Col3)
-- OR --
CREATE INDEX idx1 ON MyTable (Col1) INCLUDE (Col2, Col3)
The INCLUDE clause is new to me. Why would you use it and what guidelines would you suggest in determining whether to create a covering index with or without the INCLUDE clause?
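To make "covering" concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. SQLite has no INCLUDE clause, so only the first form (all columns as key columns) is shown; the Id and Col4 columns and the plan helper are assumptions added for illustration, and SQL Server's behavior may differ in detail.

```python
import sqlite3

# In-memory SQLite database as a stand-in; this is not SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, "
    "Col1 INT, Col2 INT, Col3 INT, Col4 TEXT)"  # Id and Col4 are hypothetical
)
conn.execute("CREATE INDEX idx1 ON MyTable (Col1, Col2, Col3)")


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Every column the query needs is in idx1, so the table itself is never read.
covered = plan("SELECT Col2, Col3 FROM MyTable WHERE Col1 = 1")

# Col4 is not in idx1, so the index finds the rows but the table is read too.
not_covered = plan("SELECT Col4 FROM MyTable WHERE Col1 = 1")

print(covered)
print(not_covered)
```

The first plan should mention a covering index, while the second should use idx1 without covering the query.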
Source: (StackOverflow)
I have limited exposure to databases and have only used them as an application programmer. I want to know about clustered and nonclustered indexes.
I googled, and this is what I found:
A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages. A nonclustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.
On SO I found "What are the differences between a clustered and a non-clustered index?".
Can someone explain this in plain English?
Source: (StackOverflow)
I have a list and I want to remove a single element from it. How can I do this?
I've tried looking up what I think the obvious names for this function would be in the reference manual and I haven't found anything appropriate.
Source: (StackOverflow)
For example, if I want to read the middle value from magic(5), I can do so like this:
M = magic(5);
value = M(3,3);
to get value == 13. I'd like to be able to do something like one of these:
value = magic(5)(3,3);
value = (magic(5))(3,3);
to dispense with the intermediate variable. However, MATLAB complains about "Unbalanced or unexpected parenthesis or bracket" on the first parenthesis before the 3.
Is it possible to read values from an array/matrix without first assigning it to a variable?
Source: (StackOverflow)
What are the differences between PRIMARY, UNIQUE, INDEX and FULLTEXT when creating MySQL tables?
How would I use them?
Source: (StackOverflow)
In R, I have an element x and a vector v. I want to find the first index of an element in v that is equal to x. I know that one way to do this is: which(x == v)[[1]], but that seems excessively inefficient. Is there a more direct way to do it?
For bonus points, is there a function that works if x is a vector? That is, it should return a vector of indices indicating the position of each element of x in v.
Source: (StackOverflow)
What is the easiest way to convert
[x1, x2, x3, ... , xN]
to
[[x1, 2], [x2, 3], [x3, 4], ... , [xN, N+1]]
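The question does not name a language, so the following is only a Python sketch of the transformation; the function name pair_with_offset_index and its start parameter are made up for illustration.

```python
def pair_with_offset_index(items, start=2):
    """Pair each element with a running counter that begins at `start`."""
    return [[item, position] for position, item in enumerate(items, start=start)]


pairs = pair_with_offset_index(["x1", "x2", "x3"])
print(pairs)
```

With the default start of 2, the first element is paired with 2 and the N-th with N+1, matching the pattern in the question.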
Source: (StackOverflow)
Is there any way I can get the actual row number from a query?
I want to be able to order a table called league_girl by a field called score and return the username and the actual row position of that username.
I want to rank the users so I can tell where a particular user stands (e.g. Joe is at position 100 out of 200), like this:
User  Score  Row
Joe   100    1
Bob   50     2
Bill  10     3
I've seen a few solutions on here but I've tried most of them and none of them actually return the row number.
I have tried this:
SELECT position, username, score
FROM (SELECT @row := @row + 1 AS position, username, score
      FROM league_girl GROUP BY username ORDER BY score DESC) AS derived
...but it doesn't seem to return the row position.
Any ideas?
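For reference, the desired output can be produced without user variables. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL and derives each position with a correlated COUNT subquery; the sample rows come from the table above, and this is one possible approach rather than a fix for the @row query.

```python
import sqlite3

# In-memory SQLite database as a stand-in; this is not MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE league_girl (username TEXT, score INT)")
conn.executemany(
    "INSERT INTO league_girl VALUES (?, ?)",
    [("Bob", 50), ("Joe", 100), ("Bill", 10)],
)

# Position = 1 + the number of rows with a strictly higher score.
# Users with equal scores therefore share the same position.
ranked = conn.execute(
    """
    SELECT g.username,
           g.score,
           (SELECT COUNT(*) FROM league_girl AS h WHERE h.score > g.score) + 1
               AS position
    FROM league_girl AS g
    ORDER BY g.score DESC
    """
).fetchall()

for username, score, position in ranked:
    print(username, score, position)
```

A correlated count is portable plain SQL but does one subquery per row, so on a large table an index on score would likely matter.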
Source: (StackOverflow)
Suppose I have two tables, Products and ProductCategories, related through CategoryId, and this query:
select p.ProductId, p.Name, c.CategoryId, c.Name AS Category
from Products p inner join ProductCategories c on p.CategoryId = c.CategoryId
where c.CategoryId = 1;
When I view the execution plan, the ProductCategories table performs a clustered index seek, as expected. But the Products table performs a clustered index scan, which makes me doubt. Why doesn't the FK help improve query performance?
So I have to create an index on Products.CategoryId. When I view the execution plan again, both tables perform an index seek, and the estimated subtree cost is reduced a lot.
My questions are:
Besides enforcing the relationship constraint, is an FK useful for anything else? Does it improve query performance?
Should I create an index on all FK columns (like Products.CategoryId) in all tables?
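The scan-versus-seek observation can be reproduced in miniature. This sketch uses Python's built-in sqlite3 module as a stand-in for SQL Server: declaring the foreign key does not create an index on the referencing column, so filtering on it scans the table until an index is added. The index name ix_products_category and the plan helper are assumptions, and SQLite's plan wording differs from SQL Server's.

```python
import sqlite3

# In-memory SQLite database as a stand-in; this is not SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ProductCategories (CategoryId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Products (
        ProductId  INTEGER PRIMARY KEY,
        Name       TEXT,
        CategoryId INTEGER REFERENCES ProductCategories (CategoryId)
    );
    """
)

QUERY = "SELECT ProductId, Name FROM Products WHERE CategoryId = 1"


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# The FK constraint alone gives the planner nothing to seek on.
before = plan(QUERY)

# ix_products_category is a hypothetical name for the extra index.
conn.execute("CREATE INDEX ix_products_category ON Products (CategoryId)")
after = plan(QUERY)

print(before)
print(after)
```

The first plan should be a full scan of Products, and the second a search using the new index.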
Source: (StackOverflow)
I know how to use INDEX as in the following code. And I know how to use foreign key and primary key.
CREATE TABLE tasks (
task_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
parent_id INT UNSIGNED NOT NULL DEFAULT 0,
task VARCHAR(100) NOT NULL,
date_added TIMESTAMP NOT NULL,
date_completed TIMESTAMP,
PRIMARY KEY (task_id),
INDEX parent (parent_id),
....
However, I found some code using KEY instead of INDEX, as follows.
...
KEY order_date (order_date)
...
I could not find any documentation about this on the official MySQL site.
Could anyone tell me what the differences are between KEY and INDEX?
The only difference I can see is that when I use KEY ..., I need to repeat the word, e.g. KEY order_date (order_date).
Source: (StackOverflow)
I've been using indexes on my MySQL databases for a while now but never properly learnt about them. Generally I put an index on any fields that I will be searching or selecting using a WHERE clause, but sometimes it doesn't seem so black and white.
What are the best practices for MySQL indexes?
Example situations/dilemmas:
- If a table has six columns and all of them are searchable, should I index all of them or none of them?
- What are the negative performance impacts of indexing?
- If I have a VARCHAR 2500 column which is searchable from parts of my site, should I index it?
Source: (StackOverflow)
Suppose I have a table of customers and a table of purchases. Each purchase belongs to one customer. I want to get a list of all customers along with their last purchase in one SELECT statement. What is the best practice? Any advice on building indexes?
Please use these table/column names in your answer:
- customer: id, name
- purchase: id, customer_id, item_id, date
And in more complicated situations, would it be (performance-wise) beneficial to denormalize the database by putting the last purchase into the customer table?
If the (purchase) id is guaranteed to be sorted by date, can the statements be simplified by using something like LIMIT 1?
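As a concrete starting point, here is a minimal sketch of one way to phrase the query, using Python's built-in sqlite3 module as a stand-in database. The table and column names are the ones requested above; the sample rows and the index name ix_purchase_customer_date are made up, and this is one possible formulation, not necessarily the best-performing one.

```python
import sqlite3

# In-memory SQLite database as a stand-in for whichever engine is in use.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (id),
        item_id     INTEGER,
        date        TEXT
    );
    -- Hypothetical composite index that supports the per-customer MAX(date).
    CREATE INDEX ix_purchase_customer_date ON purchase (customer_id, date);

    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO purchase VALUES
        (1, 1, 10, '2020-01-01'),
        (2, 1, 11, '2020-03-01'),
        (3, 2, 12, '2020-02-01');
    """
)

# LEFT JOIN keeps customers with no purchases; the correlated subquery
# selects the newest purchase date for each customer.
latest = conn.execute(
    """
    SELECT c.id, c.name, p.item_id, p.date
    FROM customer AS c
    LEFT JOIN purchase AS p
           ON p.customer_id = c.id
          AND p.date = (SELECT MAX(p2.date)
                        FROM purchase AS p2
                        WHERE p2.customer_id = c.id)
    ORDER BY c.id
    """
).fetchall()

for row in latest:
    print(row)
```

If a customer has two purchases with the same latest date, this form returns both rows, so a tie-breaker such as the purchase id would be needed to guarantee one row per customer.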
Source: (StackOverflow)