扫描DataTable的速度测试

来源:互联网 发布:mysql5.7 修改端口号 编辑:程序博客网 时间:2024/05/16 21:58
近日要做了两个小项目,都是针对大数据表(mssql 一般在千W以上)进行扫描,收集信息.

自我感觉对于全表扫描也有了小小的感觉.现将其总结于下.

功能介绍:假设内存中加载了1000条数据,仅有字段UrlPath (varchar),实现的功能等于sql语句:select urlpath,count(urlpath) from urldata group by urlpath ,勿喷,向下看.

个人使用的两种方法:

1,使用字段标记(select top 1000 urlpath,0 as Dealed from urldata)

2.使用Remove将处理过的DataRow移出DataTable

我要的是测试他们的运算时间是多少!!!

下面附上代码:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Data;
using System.Data.SqlClient;

namespace DBTest
{
class Program
{
private static DataTable table;

static void Main(string[] args)
{
GetTable();
Start();
Console.Read();
}

static void GetTable()
{
Stopwatch sw = new Stopwatch();
sw.Start();
string connstr = "Data Source=localhost;Initial Catalog=BigTable;Persist Security Info=True;User ID=***;Password=***";
SqlConnection conn = new SqlConnection(connstr);

SqlCommand command = new SqlCommand("select top 1000 * ,0 as dealed from urldata ",conn);
SqlDataAdapter sda = new SqlDataAdapter(command);
DataSet ds = new DataSet();
sda.Fill(ds);
table = ds.Tables[0];
sw.Stop();
Console.WriteLine("加载数据:{0}",sw.ElapsedMilliseconds);
}

static void Start()
{
Stopwatch sw = new Stopwatch();
sw.Start();
SignTest();
sw.Stop();
Console.WriteLine("字段标记:{0}",sw.ElapsedMilliseconds);

sw.Reset();
sw.Start();
RemoveTest();
sw.Stop();
Console.WriteLine("记录移出:{0}",sw.ElapsedMilliseconds);

}

static void RemoveTest()
{
DataRow[] rows;
string urlpath=string.Empty;

while (table.Rows.Count > 0)
{
//Console.Write(table.Rows.Count);
rows=null;
urlpath=table.Rows[0]["UrlPath"].ToString();
//将使用hashtable/Dictionary来处理
rows=table.Select("urlpath='"+urlpath+"'");
foreach(DataRow row in rows)
{
table.Rows.Remove(row);
}
}
}

static void SignTest()
{
DataRow[] rows;
string urlpath = string.Empty;
foreach (DataRow row in table.Rows)
{
if (row["dealed"].ToString() == "1")
continue;
urlpath = row["Urlpath"].ToString();
rows = table.Select("urlpath='" + urlpath + "'");

foreach (DataRow row2 in rows)
{
row2["dealed"] = 1;
}
}
}
}
}

现在看一下执行结果

到了这里,请容我说句脏话,好JB慢.

内容到此结束.

在此厚颜求高手指教更好的全表扫描的方法,只用SQL完成不行(1000W的记录分个组就搞死人家数据库了难过),只要有理,请喷!!!

转载请注明出处.


原创粉丝点击