c# - How to Speed Up Reading a File using FileStream


I am facing a performance issue while searching the content of files. I am using the FileStream class to read the files (~10 files are involved in each search, each being ~70 MB in size). However, these files are simultaneously being accessed and updated by another process during the search, so I cannot buffer or cache the whole files while reading them. Even with a buffer size set on the StreamReader, a search takes 3 minutes with the Regex approach below.

Has anyone come across a similar situation who can offer pointers on improving the performance of the file search?

Code snippet:

    using System;
    using System.IO;
    using System.Text;
    using System.Text.RegularExpressions;

    private static int BufferSize = 32768;

    using (FileStream fs = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        using (TextReader txtReader = new StreamReader(fs, Encoding.UTF8, true, BufferSize))
        {
            // Each log record starts with a "yyyy-MM-dd HH:mm:ss" timestamp;
            // records are split on that boundary.
            Regex patternMatching = new Regex(@"(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})(.*?)(?=\n\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})", RegexOptions.IgnoreCase);
            Regex dateStringMatch = new Regex(@"^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}");
            char[] temp = new char[1048576];
            while (txtReader.ReadBlock(temp, 0, 1048576) > 0)
            {
                StringBuilder parseString = new StringBuilder();
                parseString.Append(temp);
                // If the block did not end on a line break, keep reading lines until the
                // next line starts with a digit (i.e. the next timestamp).
                // Note: this checks only index 1023 of a 1048576-char buffer and compares
                // a single char to Environment.NewLine (two chars on Windows).
                if (temp[1023].ToString() != Environment.NewLine)
                {
                    parseString.Append(txtReader.ReadLine());
                    while (txtReader.Peek() > 0 && !(txtReader.Peek() >= 48 && txtReader.Peek() <= 57))
                    {
                        parseString.Append(txtReader.ReadLine());
                    }
                }
                if (parseString.Length > 0)
                {
                    string[] allRecords = patternMatching.Split(parseString.ToString());
                    foreach (var item in allRecords)
                    {
                        var contentString = item.Trim();
                        if (!string.IsNullOrWhiteSpace(contentString))
                        {
                            var matches = dateStringMatch.Matches(contentString);
                            if (matches.Count > 0)
                            {
                                var rowDateTime = DateTime.MinValue;
                                if (DateTime.TryParse(matches[0].Value, out rowDateTime))
                                {
                                    if (rowDateTime >= startDate && rowDateTime < endDate)
                                    {
                                        if (contentString.ToLowerInvariant().Contains(searchText))
                                        {
                                            var result = new SearchResult
                                            {
                                                LogFileType = logFileType,
                                                Message = string.Format(messageTemplateNew, item),
                                                Timestamp = rowDateTime,
                                                ComponentName = componentName,
                                                FileName = filePath,
                                                ServerName = serverName
                                            };
                                            searchResults.Add(result);
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    return searchResults;

Some time ago I had to analyse many FileZilla server logfiles, each >120 MB. I used a simple List of the lines of each logfile and had great performance searching for specific lines.

    List<string> fileContent = File.ReadAllLines(pathToFile).ToList();
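
Searching that list in memory is then straightforward. As a sketch (assuming the fileContent list above and a searchText variable like the one in your code):

    using System.Collections.Generic;
    using System.Linq;

    // All filtering happens in memory, so repeated searches cost no extra disk I/O.
    List<string> matchingLines = fileContent
        .Where(line => line.ToLowerInvariant().Contains(searchText))
        .ToList();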

But in your case I think the main reason for the bad performance isn't reading the file. Try putting a Stopwatch around parts of the loop to check where the time is spent. Regex and TryParse can be time consuming if they are used many times in a loop like yours.
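
For example, a minimal sketch of that kind of instrumentation (the loop body and variable names are taken from your code; the two Stopwatch accumulators are new):

    using System;
    using System.Diagnostics;

    Stopwatch regexTime = new Stopwatch();
    Stopwatch parseTime = new Stopwatch();

    foreach (var item in allRecords)
    {
        var contentString = item.Trim();
        if (string.IsNullOrWhiteSpace(contentString)) continue;

        // Time the regex matching separately from the date parsing.
        regexTime.Start();
        var matches = dateStringMatch.Matches(contentString);
        regexTime.Stop();

        if (matches.Count > 0)
        {
            parseTime.Start();
            DateTime rowDateTime;
            bool parsed = DateTime.TryParse(matches[0].Value, out rowDateTime);
            parseTime.Stop();
            // ... the rest of the filtering as in the question ...
        }
    }

    Console.WriteLine("Regex total:    {0} ms", regexTime.ElapsedMilliseconds);
    Console.WriteLine("TryParse total: {0} ms", parseTime.ElapsedMilliseconds);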

