regex - Removing uneven spaces from input file in Ruby


I have a text file that contains balance sheet information for a company. The problem is that the spacing is uneven, and the data looks like this:

    28/07/15                 2.85                                104,689.13
    30/07/15                                 31,862.00           136,551.13

The reason is that the 2.85 on the first line is a debit, while the amount on the second line is a credit.

How can I get the data in Ruby as 4 elements per line, with the credit being empty on the first line and the debit empty on the second?

I can split the data on multiple spaces and compare the balance between successive lines to work out the credit vs. debit information, but I want to know if there is a better way (maybe a regex) to do this.

Thank you.

Here's a way that should work even if the lines are messed up. It relies on the fact that debits (credits) reduce (increase) the balance by the amount of the debit (credit). Let's first write the data to a file:

    data =<<_
    28/07/15                 2.85  104,689.13
    30/07/15        31,862.00                                    136,551.13
    28/07/15 1.13 136,550.00
    30/07/15                                 10,000.01           146,550.01
    _

    fname = 'temp'
    IO.write(fname, data)
      #=> 288 (bytes written; depends on the exact whitespace)

The method for extracting the fields follows. It requires the file name and the starting balance. Alternatively, the second argument could be a boolean indicating whether the first line contains a debit or a credit.

    require 'bigdecimal'

    def extract_transactions(fname, starting_balance)
      transactions = []
      IO.readlines(fname).reduce(BigDecimal(starting_balance)) do |start_bal, s|
        # Each line holds three fields: date, amount (debit or credit), new balance.
        date, debit_or_credit, bal = s.strip.delete(',').split(/\s+/)
        h = { date: date, debit: '', credit: '', balance: bal }
        # A debit lowers the previous balance by its amount; otherwise it is a credit.
        if BigDecimal(bal) == start_bal - BigDecimal(debit_or_credit)
          h[:debit] = debit_or_credit
        else
          h[:credit] = debit_or_credit
        end
        transactions << h
        # The block's value becomes start_bal for the next line.
        BigDecimal(bal)
      end
      transactions
    end

Let's try it:

    extract_transactions(fname, "104691.98")
      #=> [{:date=>"28/07/15", :debit=>"2.85", :credit=>"", :balance=>"104689.13"},
      #    {:date=>"30/07/15", :debit=>"", :credit=>"31862.00", :balance=>"136551.13"},
      #    {:date=>"28/07/15", :debit=>"1.13", :credit=>"", :balance=>"136550.00"},
      #    {:date=>"30/07/15", :debit=>"", :credit=>"10000.01", :balance=>"146550.01"}]

I used BigDecimal to avoid problems with round-off errors.
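To illustrate the round-off issue, here is a minimal sketch comparing a sum check done with floats and with BigDecimal. The helper names float_sum_matches? and decimal_sum_matches? are hypothetical and not part of the answer's code; the last call uses the amounts from the second sample line.

```ruby
require 'bigdecimal'

# Float comparison: binary round-off can make an exact-looking sum compare unequal.
def float_sum_matches?(a, b, expected)
  a.to_f + b.to_f == expected.to_f
end

# BigDecimal comparison on the same decimal strings is exact.
def decimal_sum_matches?(a, b, expected)
  BigDecimal(a) + BigDecimal(b) == BigDecimal(expected)
end

puts float_sum_matches?('0.1', '0.2', '0.3')    # prints false
puts decimal_sum_matches?('0.1', '0.2', '0.3')  # prints true

# Credit on the second sample line: 104,689.13 + 31,862.00 = 136,551.13
puts decimal_sum_matches?('104689.13', '31862.00', '136551.13')  # prints true
```

Because the method decides between debit and credit with an equality test, an exact decimal type keeps that test reliable.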

Enumerable#reduce (aka inject) updates the balance (start_bal, initially starting_balance) after each transaction (row).
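As a small illustration of how reduce carries the balance from row to row, here is a sketch with a hypothetical helper, running_balances, that takes signed amounts (negative for debits). Neither the helper nor the signed-amount convention is part of the answer's code; they are assumptions made to keep the example short.

```ruby
require 'bigdecimal'

# Hypothetical helper: returns the balance after each signed amount.
# The block's return value becomes the accumulator for the next element,
# which is how the balance is threaded through the rows.
def running_balances(starting_balance, signed_amounts)
  balances = []
  signed_amounts.reduce(BigDecimal(starting_balance)) do |bal, amount|
    new_bal = bal + BigDecimal(amount)
    balances << new_bal
    new_bal # handed to the next iteration as bal
  end
  balances
end

result = running_balances('104691.98', ['-2.85', '31862.00'])
puts result.map { |b| b.to_s('F') }.inspect
# prints ["104689.13", "136551.13"]
```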

Edit: here's a non-BigDecimal variant (that's better):

    def extract_transactions(fname, debit_first)
      # Seed the balance so the first line is classified as requested:
      # nothing exceeds Infinity (debit first), everything exceeds -Infinity (credit first).
      curr_bal = (debit_first ? Float::INFINITY : -Float::INFINITY)
      IO.readlines(fname).each_with_object([]) do |s, transact|
        date, debit, bal = s.strip.delete(',').split(/\s+/)
        credit = ''
        bal_float = bal.to_f
        # A rising balance means the amount was a credit, so swap the two fields.
        (debit, credit = credit, debit) if bal_float > curr_bal
        transact << { date: date, debit: debit, credit: credit, balance: bal }
        curr_bal = bal_float
      end
    end

    extract_transactions(fname, true)
      #=> [{:date=>"28/07/15", :debit=>"2.85", :credit=>"", :balance=>"104689.13"},
      #    {:date=>"30/07/15", :debit=>"", :credit=>"31862.00", :balance=>"136551.13"},
      #    {:date=>"28/07/15", :debit=>"1.13", :credit=>"", :balance=>"136550.00"},
      #    {:date=>"30/07/15", :debit=>"", :credit=>"10000.01", :balance=>"146550.01"}]
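Since the question asks for four elements per line, the hashes shown in the output above can be turned into four-element arrays with Hash#values_at. This is a minimal sketch that uses the first two hashes from that output as literal sample input; to_row is a hypothetical helper name, not part of the answer.

```ruby
# Hypothetical helper: one transaction hash -> [date, debit, credit, balance].
def to_row(transaction)
  transaction.values_at(:date, :debit, :credit, :balance)
end

# Sample input copied from the first two rows of the output above.
transactions = [
  { date: '28/07/15', debit: '2.85', credit: '', balance: '104689.13' },
  { date: '30/07/15', debit: '', credit: '31862.00', balance: '136551.13' }
]

rows = transactions.map { |t| to_row(t) }
rows.each { |row| p row }
# ["28/07/15", "2.85", "", "104689.13"]
# ["30/07/15", "", "31862.00", "136551.13"]
```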
