I have been going through various sites and codes, but nothing seems to end my misery. Either they help to find and remove duplicates for a specific column or they remove only from the datatable, not the underlying database itself. I want to delete duplicate rows from table "table1" of my mdb file.
To make my requirements clearer:
by duplicate rows I mean rows that are not distinct. for eg, in the following example,only 3rd and 5th row are duplicates. And I want to delete any of them.
Name1 Name2 Name3
tom dick harry
tom dick mike
ann sara mike
sara ann mike
ann sara mike
The duplicate rows should be deleted from database with a button click as follows
private void button1_Click(object sender, EventArgs e)
{
deletedupes();
}
private void deletedupes()
{
OleDbConnection con = new OleDbConnection("PROVIDER=Microsoft.Jet.OLEDB.4.0; Data Source=C:\\hi.mdb");
DataSet ds = new DataSet();
OleDbDataAdapter da = new OleDbDataAdapter("select * from table1", con);
con.Open();
da.Fill(ds, "table1");
// what could be rest of the code??
}
Thanks in advance. Yes I'm a novice..
As none of the answers were satsifactory for me (I'm just a tad too novice to understand the succint and slightly technicalized way spoken here by more knowledgable and experienced people), i tried my own variant to get this done. I could not follow what to be done with commands like distinct
or set rowcount
or delete from
etc. Nowhere I could find a fully deployed code in an example. So I tried this. From scratch.
int id, k;
private void button2_Click(object sender, EventArgs e)
{
OleDbConnection con = new OleDbConnection("PROVIDER=Microsoft.Jet.OLEDB.4.0; Data Source=C:\\hi.mdb");
DataSet ds = new DataSet();
OleDbDataAdapter da = new OleDbDataAdapter("select * from table2", con);
con.Open();
da.Fill(ds, "table2");
for (int i = 0; i < ds.Tables["table2"].Rows.Count; i++)
{
DataRow row = ds.Tables["table2"].Rows[i];
k++;
for (int j = k; j < ds.Tables["table2"].Rows.Count; j++)
{
DataRow row2 = ds.Tables["table2"].Rows[j];
if (row.ItemArray.GetValue(1).ToString() == row2.ItemArray.GetValue(1).ToString())
{
if (row.ItemArray.GetValue(3).ToString() == row2.ItemArray.GetValue(3).ToString())
{
id = int.Parse(row2.ItemArray.GetValue(0).ToString());
deletedupes(id);
}
}
}
}
con.Close();
}
private void deletedupes(int num)
{
OleDbConnection con = new OleDbConnection("PROVIDER=Microsoft.Jet.OLEDB.4.0; Data Source=C:\\hi.mdb");
con.Open();
OleDbCommand c = new OleDbCommand("Delete from table2 where id =?", con);
c.Parameters.AddWithValue("id", num);
c.ExecuteNonQuery();
con.Close();
}
Edit: Sorry, I missed to say that I did use a unique column having a primary key to get this done. Nevertheless, this can be done without that as well. Just a matter of choice. And for unknown reasons, this method seems so fast too..