Top frequently asked interview questions about duplicates

How to remove duplicate values from an array in PHP

How can I remove duplicate values from an array in PHP?

How to remove duplicate values from a multi-dimensional array in PHP

How can I remove duplicate values from a multi-dimensional array in PHP?

Example array:

Array
(
    [0] => Array
	(
	    [0] => abc
	    [1] => def
	)

    [1] => Array
	(
	    [0] => ghi
	    [1] => jkl
	)

    [2] => Array
	(
	    [0] => mno
	    [1] => pql
	)

    [3] => Array
	(
	    [0] => abc
	    [1] => def
	)

    [4] => Array
	(
	    [0] => ghi
	    [1] => jkl
	)

    [5] => Array
	(
	    [0] => mno
	    [1] => pql
	)

)

Source: (StackOverflow)

Removing duplicate rows from table in Oracle

I'm testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key using some of the columns.

How can I delete all duplicate rows and leave only one of them?

Source: (StackOverflow)

How do I (or can I) SELECT DISTINCT on multiple columns?

I need to retrieve all rows from a table where 2 columns combined are all different. So I want all the sales that do not have any other sales that happened on the same day for the same price. The sales that are unique based on day and price will get updated to an active status.

So I'm thinking:

UPDATE sales
SET status = 'ACTIVE'
WHERE id IN (SELECT DISTINCT (saleprice, saledate), id, count(id)
             FROM sales
             HAVING count = 1)

But my brain hurts going any farther than that.

Source: (StackOverflow)
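
One way to approach the sales question above is to group on the two columns and keep only the groups that contain a single row. The sketch below is not the asker's code: it assumes SQLite (through Python's sqlite3 module) rather than whatever engine the question targets, and the table layout and sample rows are made up for illustration.

```python
import sqlite3

def activate_unique_sales(conn):
    """Set status = 'ACTIVE' on sales whose (saleprice, saledate) pair occurs once."""
    conn.execute(
        """
        UPDATE sales
        SET status = 'ACTIVE'
        WHERE id IN (
            SELECT MIN(id)              -- a group of one row: MIN(id) is that row
            FROM sales
            GROUP BY saleprice, saledate
            HAVING COUNT(*) = 1
        )
        """
    )
    conn.commit()

def demo():
    # Hypothetical schema and data, assumed for this sketch only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, saleprice REAL, "
        "saledate TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO sales (id, saleprice, saledate, status) VALUES (?, ?, ?, ?)",
        [
            (1, 10.0, "2024-01-01", "NEW"),
            (2, 10.0, "2024-01-01", "NEW"),   # same price and day as id 1
            (3, 12.5, "2024-01-01", "NEW"),
            (4, 10.0, "2024-01-02", "NEW"),
        ],
    )
    activate_unique_sales(conn)
    return conn.execute("SELECT id, status FROM sales ORDER BY id").fetchall()
```

Other database engines may restrict updating a table from a subquery on the same table, so the statement may need adapting there.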

Find duplicate lines in a file and count how many times each line was duplicated?

Suppose I have a file similar to the following:

I would like to find how many times '123' was duplicated, how many times '234' was duplicated, etc. So ideally, the output would be like:

123  3 
234  2 
345  1

Source: (StackOverflow)
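
The counting asked for above can be sketched in Python with collections.Counter. The input file is not shown in the question, so the sample lines used in the test below are an assumption chosen to match the desired output; count_lines is a hypothetical helper name.

```python
from collections import Counter

def count_lines(lines):
    """Return (line, count) pairs, most frequent first."""
    # Strip the trailing newline so "123\n" and "123" count as the same line.
    counts = Counter(line.rstrip("\n") for line in lines)
    return counts.most_common()

def count_file(path):
    """Same as count_lines, reading the lines from a text file."""
    with open(path, encoding="utf-8") as handle:
        return count_lines(handle)
```

Each line is read once and the counts are held in memory, so this suits files whose set of distinct lines fits in RAM.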

Remove duplicate rows in MySQL

I have a table with the following fields:

id (Unique)
url (Unique)
title
company
site_id

Now, I need to remove rows having the same title, company and site_id. One way to do it is to use the following SQL along with a script (PHP):

SELECT title, site_id, location, id, count( * ) 
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1

After running this query, I can remove duplicates using a server-side script. But I want to know if this can be done using only an SQL query.

Source: (StackOverflow)
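
A SQL-only deletion of the kind asked about above could look like the following sketch. It assumes SQLite through Python's sqlite3 module instead of MySQL, keeps the row with the lowest id in each (title, company, site_id) group, and uses invented sample rows; MySQL may need the subquery wrapped differently.

```python
import sqlite3

def delete_duplicate_jobs(conn):
    """Delete all but the lowest-id row per (title, company, site_id) group."""
    cur = conn.execute(
        """
        DELETE FROM jobs
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM jobs
            GROUP BY title, company, site_id
        )
        """
    )
    conn.commit()
    return cur.rowcount  # number of rows removed

def demo():
    # Hypothetical data; the column names follow the question's field list.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, url TEXT UNIQUE, "
        "title TEXT, company TEXT, site_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO jobs (id, url, title, company, site_id) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "u1", "Dev", "Acme", 1),
            (2, "u2", "Dev", "Acme", 1),   # duplicate of id 1
            (3, "u3", "Dev", "Acme", 2),
            (4, "u4", "QA", "Acme", 1),
            (5, "u5", "Dev", "Acme", 1),   # duplicate of id 1
        ],
    )
    removed = delete_duplicate_jobs(conn)
    remaining = [row[0] for row in conn.execute("SELECT id FROM jobs ORDER BY id")]
    return removed, remaining
```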

How can I remove duplicate rows?

What is the best way to remove duplicate rows from a fairly large table (i.e. 300,000+ rows)?

The rows of course will not be perfect duplicates because of the existence of the RowID identity field.

MyTable
-----------
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null

Source: (StackOverflow)

Find duplicate records in MySQL

I want to pull out duplicate records in a MySQL Database. This can be done with:

SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1

Which results in:

100 MAIN ST    2

I would like to pull it so that it shows each row that is a duplicate. Something like:

JIM    JONES    100 MAIN ST
JOHN   SMITH    100 MAIN ST

Any thoughts on how this can be done? I'm trying to avoid doing the first one then looking up the duplicates with a second query in the code.

Source: (StackOverflow)
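
To list every row that shares an address, as the question above asks, the grouped query can be joined back to the table in a single statement. This is a sketch under assumptions: SQLite via Python's sqlite3 module stands in for MySQL, and the firstname and lastname column names are hypothetical because the question does not name them.

```python
import sqlite3

DUPLICATE_ROWS_SQL = """
    SELECT list.firstname, list.lastname, list.address
    FROM list
    JOIN (
        SELECT address
        FROM list
        GROUP BY address
        HAVING COUNT(id) > 1           -- addresses that occur more than once
    ) AS dup ON list.address = dup.address
    ORDER BY list.address, list.id
"""

def demo():
    # Hypothetical schema; only "address" and "id" appear in the question.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE list (id INTEGER PRIMARY KEY, firstname TEXT, "
        "lastname TEXT, address TEXT)"
    )
    conn.executemany(
        "INSERT INTO list (firstname, lastname, address) VALUES (?, ?, ?)",
        [
            ("JIM", "JONES", "100 MAIN ST"),
            ("JOHN", "SMITH", "100 MAIN ST"),
            ("JANE", "DOE", "5 OAK AVE"),      # unique address, not returned
        ],
    )
    return conn.execute(DUPLICATE_ROWS_SQL).fetchall()
```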

Python removing duplicates in lists

So pretty much I need to write a program to check if a list has any duplicates, and if it does, it removes them and returns a new list with the items that weren't duplicated/removed. This is what I have, but to be honest I do not know what to do.

def remove_duplicates():
    t = ['a', 'b', 'c', 'd']
    t2 = ['a', 'c', 'd']
    for t in t2:
        t.append(t.remove())
    return t

Source: (StackOverflow)
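
For the Python list question above, one common approach is to rely on the fact that dict keys are unique and keep insertion order. The sketch below assumes the list items are hashable (strings, numbers, tuples); it is one possible reading of the task, keeping the first occurrence of each item.

```python
def remove_duplicates(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    # dict.fromkeys builds a dict whose keys are the items in order of first
    # appearance; converting the keys back to a list drops the repeats.
    return list(dict.fromkeys(items))

def has_duplicates(items):
    """Return True if any item occurs more than once."""
    return len(set(items)) != len(items)
```

If the original order does not matter, list(set(items)) is a shorter alternative, but it does not preserve the order of the input.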

How do I remove repeated elements from ArrayList?

I have an ArrayList of Strings, and I want to remove repeated strings from it. How can I do this?

Source: (StackOverflow)

MySQL remove duplicates from big database quick

I've got a big (more than a million rows) MySQL database messed up by duplicates. I think from 1/4 to 1/2 of the whole database could be filled with them. I need to get rid of them quickly (I mean query execution time). Here's how it looks:

id (index) | text1 | text2 | text3

The text1 & text2 combination should be unique; if there are any duplicates, only one combination with text3 NOT NULL should remain. Example:

1 | abc | def | NULL  
2 | abc | def | ghi  
3 | abc | def | jkl  
4 | aaa | bbb | NULL  
5 | aaa | bbb | NULL

...becomes:

1 | abc | def | ghi   #(it doesn't really matter whether id:2 or id:3 survives)
2 | aaa | bbb | NULL  #(if there's no NOT NULL text3, NULL will do)

New ids could be anything; they do not depend on the old table ids.
I've tried things like:

CREATE TABLE tmp SELECT text1, text2, text3
FROM my_tbl
GROUP BY text1, text2;
DROP TABLE my_tbl;
ALTER TABLE tmp RENAME TO my_tbl;

Or SELECT DISTINCT and other variations.
While they work on small databases, query execution time on mine is just huge (never got to the end, actually; > 20 min)

Is there any faster way to do that? Please help me solve this problem.

Source: (StackOverflow)
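
The rule in the question above (one row per text1/text2 pair, preferring a non-NULL text3) can be expressed with an aggregate, since MAX ignores NULL values and returns NULL only when every value in the group is NULL. The sketch below assumes SQLite through Python's sqlite3 module rather than MySQL, and it says nothing about speed on a table with millions of rows.

```python
import sqlite3

def rebuild_without_duplicates(conn):
    """Rebuild my_tbl with one row per (text1, text2), preferring non-NULL text3."""
    conn.executescript(
        """
        CREATE TABLE tmp (
            id INTEGER PRIMARY KEY,    -- new ids, unrelated to the old ones
            text1 TEXT, text2 TEXT, text3 TEXT
        );
        INSERT INTO tmp (text1, text2, text3)
        SELECT text1, text2, MAX(text3)   -- MAX skips NULLs within a group
        FROM my_tbl
        GROUP BY text1, text2;
        DROP TABLE my_tbl;
        ALTER TABLE tmp RENAME TO my_tbl;
        """
    )

def demo():
    # Sample rows copied from the question's example.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_tbl (id INTEGER PRIMARY KEY, text1 TEXT, "
        "text2 TEXT, text3 TEXT)"
    )
    conn.executemany(
        "INSERT INTO my_tbl (id, text1, text2, text3) VALUES (?, ?, ?, ?)",
        [
            (1, "abc", "def", None),
            (2, "abc", "def", "ghi"),
            (3, "abc", "def", "jkl"),
            (4, "aaa", "bbb", None),
            (5, "aaa", "bbb", None),
        ],
    )
    rebuild_without_duplicates(conn)
    return conn.execute(
        "SELECT text1, text2, text3 FROM my_tbl ORDER BY text1"
    ).fetchall()
```

With this data the surviving "abc | def" row carries "jkl", because MAX picks the largest non-NULL string; the question states that either non-NULL value is acceptable.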

Algorithm: efficient way to remove duplicate integers from an array

I got this problem from an interview with Microsoft.

Given an array of random integers, write an algorithm in C that removes duplicated numbers and returns the unique numbers in the original array.

E.g. Input: {4, 8, 4, 1, 1, 2, 9} Output: {4, 8, 1, 2, 9, ?, ?}

One caveat is that the expected algorithm should not require the array to be sorted first. And when an element has been removed, the following elements must be shifted forward as well. The values of the elements at the tail of the array, where elements were shifted forward, do not matter.

Update: The result must be returned in the original array, and a helper data structure (e.g. a hash table) should not be used. However, I guess order preservation is not necessary.

Update2: For those who wonder why these impractical constraints, this was an interview question and all these constraints are discussed during the thinking process to see how I can come up with different ideas.

Source: (StackOverflow)
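
One way to meet the constraints in the question above (no sorting, no helper data structure, result in the original array) is a quadratic scan that compares each element against the prefix already kept. The sketch is written in Python rather than the C the interviewer asked for, and dedupe_in_place is a hypothetical name; it trades time (O(n^2) comparisons) for constant extra space.

```python
def dedupe_in_place(arr):
    """Move the distinct values to the front of arr; return how many there are.

    Elements after the returned length are leftovers and should be ignored.
    """
    unique_len = 0                      # arr[:unique_len] holds the kept values
    for i in range(len(arr)):
        seen = False
        for j in range(unique_len):     # scan only the kept prefix
            if arr[j] == arr[i]:
                seen = True
                break
        if not seen:
            arr[unique_len] = arr[i]    # shift the new value forward
            unique_len += 1
    return unique_len
```

For the input from the question, the call returns 5 and the first five slots hold 4, 8, 1, 2, 9 in their original order.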

How to find a duplicate element in an array of shuffled consecutive integers?

I recently came across a question somewhere:

Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?

What I am interested in knowing is the second part, i.e., without using auxiliary storage. Do you have any idea?

Source: (StackOverflow)
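
For the second part of the question above, a well-known approach is arithmetic: the numbers 1 to n sum to n(n+1)/2, so whatever the array's total exceeds that by is the repeated value. The sketch below assumes exactly the setup described (every value from 1 to n present once, plus one repeat); find_repeated is a hypothetical name.

```python
def find_repeated(values, n=1000):
    """Return the one value that appears twice among the integers 1..n."""
    total = 0
    for value in values:        # each element is read exactly once
        total += value
    expected = n * (n + 1) // 2  # sum of 1..n with no repeats
    return total - expected
```

Only a single running total is kept, so no auxiliary storage proportional to the input is needed.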

Linux command or script counting duplicated lines in a text file?

If I have a text file with the following content:

red apple
green apple
green apple
orange
orange
orange

Is there a Linux command or script that I can use to get the following result?

1 red apple
2 green apple
3 orange

Source: (StackOverflow)
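
On the shell side, sort piped into uniq -c is the usual tool for this. As a script alternative, the sketch below mirrors that behaviour in Python by counting runs of identical adjacent lines with itertools.groupby; count_adjacent is a hypothetical name, and like uniq it only merges lines that are next to each other, so unsorted input should be sorted first.

```python
from itertools import groupby

def count_adjacent(lines):
    """Return (count, line) pairs for each run of identical adjacent lines."""
    return [(len(list(group)), line) for line, group in groupby(lines)]

def format_counts(lines):
    """Render the counts one per line, as 'count text'."""
    return "\n".join(f"{count} {line}" for count, line in count_adjacent(lines))
```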

C# LINQ find duplicates in List

Using LINQ, from a List<int>, how can I retrieve a list that contains entries repeated more than once and their values?

Source: (StackOverflow)

EzDevInfo.com
