Reading from file, using Scheme
Maybe this will get you started.
(define (file->list-of-chars file)
  (with-input-from-file file
    (lambda ()
      (let reading ((chars '()))
        (let ((char (read-char)))
          (if (eof-object? char)
              (reverse chars)
              (reading (cons char chars))))))))
The most commonly recommended way to import text is to edit the file and save it as a Scheme file that defines a variable:
(define data "the text in
mydata.scm here")
and then calling:
(load "mydata.scm")
However, not every data file can simply be edited and saved as a Scheme file. Newlines inside a string literal are handled automatically, but double quotes are not, and an unescaped double quote causes a problem when the file is loaded.
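For instance, text that contains double quotes has to have each one escaped with a backslash by hand before the file will load. A minimal sketch, where the contents of the string are made up for illustration:

```scheme
; hypothetical contents of mydata.scm
; the newline is kept as-is, but each inner double quote needs a backslash
(define data "She said \"hello\"
and left.")
```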
Some implementation-specific techniques are:
;Chicken
(use utils)
(read-all "mydata.txt")
;Racket
(file->string "mydata.txt")
A more portable function is:
;works in chicken-csi and Racket
(define (readlines filename)
  (call-with-input-file filename
    (lambda (p)
      (let loop ((line (read-line p))
                 (result '()))
        (if (eof-object? line)
            (reverse result)
            (loop (read-line p) (cons line result)))))))
An executable compiled with Chicken's csc will fail with an error, because read-line requires an extra file.
The most portable way to read a file is this function:
;works in Chicken, Racket, SISC
;Read a file to a list of chars
(define (file->char_list path)
  (call-with-input-file path
    (lambda (input-port)
      (let loop ((x (read-char input-port)))
        (cond
          ((eof-object? x) '())
          (else (cons x (loop (read-char input-port)))))))))
This function is reasonably fast and portable across implementations. All that remains is to convert the list of chars to a string.
The simplest way is:
;may not work if there is limit on arguments
(apply string (file->char_list "mydata.txt"))
The catch is that some implementations limit the number of arguments that can be passed to a function. A list of 2049 chars would not work in Chicken.
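A standard alternative that should sidestep the argument limit is list->string, which is part of R5RS and takes the whole list as a single argument. A minimal sketch, using a short literal list as a stand-in for the result of file->char_list:

```scheme
; stand-in for (file->char_list "mydata.txt")
(define chars (list #\d #\a #\t #\a))

; list->string receives one argument, however long the list is
(define text (list->string chars))
```

I have not timed this against the other methods on each implementation, so treat it as an option to try rather than a measured result.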
Another method is:
;works in Chicken, Racket
(foldr (lambda (x y) (string-append (string x) y)) "" (file->char_list "mydata.txt"))
There are two problems. First, foldr is not universally available (SISC lacks it), though it could be defined. Second, this method is very slow, because it calls string-append once per character and each call copies the string built so far.
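Where foldr is missing, it could be defined along these lines. This is only a sketch of the usual right fold; the name my-foldr is made up here so that it does not clash with a built-in:

```scheme
; (my-foldr f init (list a b c)) computes (f a (f b (f c init)))
(define (my-foldr f init lis)
  (if (null? lis)
      init
      (f (car lis) (my-foldr f init (cdr lis)))))

; same call shape as the foldr line above, on a short literal list
(define joined
  (my-foldr (lambda (x y) (string-append (string x) y))
            ""
            (list #\a #\b #\c)))
; joined is "abc"
```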
I wrote the next two functions to slice a list of chars into nested lists until no list at the lowest level exceeds the maximum argument count in Chicken. The third function traverses the nested char list and returns a string using string and string-append:
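; Split list a into (first-n-elements rest)
(define (cleave_at n a)
  (cond
    ; nothing to split: return two empty halves so callers can still take car and cdr
    ((null? a) (list '() '()))
    ((zero? n) (list '() a))
    (else
     ((lambda (x)
        (cons (cons (car a) (car x)) (cdr x)))
      (cleave_at (- n 1) (cdr a))))))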
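; Halve list a recursively until no leaf list is longer than n
(define (cleave_binary_nest n a)
  (cond
    ((= n (length a)) (list a))
    (else
     ((lambda (x)
        (cond
          ; the second half is never shorter than the first, so test that one
          ((> (length (cadr x)) n)
           (map (lambda (y) (cleave_binary_nest n y)) x))
          (else x)))
      (cleave_at (quotient (length a) 2) a)))))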
(define (binary_nest_char->string a)
  (cond
    ((null? a) "")
    ((char? (car a)) (apply string a))
    (else (string-append
           (binary_nest_char->string (car a))
           (binary_nest_char->string (cdr a))))))
The function is called like this:
;Works in Racket, Chicken, SISC
;faster than foldr method (3x faster interpreted Chicken) (30x faster compiled Chicken) (125x faster Racket gui)
(binary_nest_char->string (cleave_binary_nest 2048 (file->char_list "mydata.txt")))
To reduce the text to alphabetic characters and spaces, two more functions are needed:
(define (alphaspace? x)
  (or (and (char-ci>=? x #\a) (char-ci<=? x #\z))
      (char=? x #\space)))
(define (filter pred lis)
  ; if lis is empty
  (if (null? lis)
      ; return an empty list
      '()
      ; otherwise, if the predicate is true on the first element
      (if (pred (car lis))
          ; return the first element concatenated with the
          ; result of calling filter on the rest of lis
          (cons (car lis) (filter pred (cdr lis)))
          ; otherwise (if the predicate was false) just
          ; return the result of filtering the rest of lis
          (filter pred (cdr lis)))))
(define data (file->char_list "mydata.txt"))
(define data_alphaspace (filter alphaspace? data))
(define result (binary_nest_char->string (cleave_binary_nest 2048 data_alphaspace)))
This works on Racket, Chicken (interpreted and compiled), and SISC (Java). Each of those implementations should also work on Linux, Mac (OS X), and Windows.