Reading from file, using Scheme

Maybe this will get you started.

(define (file->list-of-chars file)
  (with-input-from-file file
    (lambda ()
      (let reading ((chars '()))
        (let ((char (read-char)))
          (if (eof-object? char)
              (reverse chars)
              (reading (cons char chars))))))))

The most commonly recommended way to import text is to edit the file and save it as a Scheme file that defines a variable:

(define data "the text in
mydata.scm here")

and then call:

(load "mydata.scm")

However, not every data file can simply be edited and saved as a Scheme file. Newlines inside the string are handled automatically, but double quotes in the data are not escaped, and this causes an error when the file is loaded.
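One way around the quoting problem, sketched here under the assumption that the text is already available as a string inside Scheme, is to let write produce the definition, since write prints strings with double quotes and backslashes escaped. The name data_definition is made up for this illustration.

```scheme
;Hypothetical helper: build the form (define data "...") as a list
(define (data_definition text)
  (list 'define 'data text))

;write prints strings in re-readable form, escaping " and \
(write (data_definition "She said \"hi\" and left"))
(newline)

;To create the file itself, the same write call could be wrapped in
;(with-output-to-file "mydata.scm" (lambda () ...))
```

The printed form can then be loaded with load like any other Scheme file.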

Some implementation-specific techniques are:

;Chicken (version 4; Chicken 5 dropped use in favor of import)
(use utils)
(read-all "mydata.txt")

;Racket
(file->string "mydata.txt")

A more portable function is:

;works in chicken-csi and Racket
(define (readlines filename)
  (call-with-input-file filename
    (lambda (p)
      (let loop ((line (read-line p))
                 (result '()))
        (if (eof-object? line)
            (reverse result)
            (loop (read-line p) (cons line result)))))))

Running an executable compiled with Chicken's csc gives an error, because read-line requires an extra file.

The most portable way to read a file is this function:

;works in Chicken, Racket, SISC
;Read a file to a list of chars
(define (file->char_list path)
 (call-with-input-file path
   (lambda (input-port)
     (let loop ((x (read-char input-port)))
       (cond
        ((eof-object? x) '())
        (else (cons x (loop (read-char input-port)))))))))

This function is reasonably fast and portable across implementations. All that is needed is to convert the char_list to a string.

The simplest way is:

;may not work if there is limit on arguments
(apply string (file->char_list "mydata.txt"))

The catch is that some implementations limit the number of arguments that can be passed to a function. In Chicken, a list of 2049 chars would not work.
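For comparison, the standard procedure list->string (part of R5RS) receives the whole list as a single argument, so the argument limit of apply should not come into play. A minimal sketch, using an in-memory list of 5000 chars in place of a file; the names big_list and big_string are only for illustration:

```scheme
;5000 chars, more than the 2049 mentioned above
(define big_list (string->list (make-string 5000 #\a)))

;list->string gets ONE argument (the list), however long it is
(define big_string (list->string big_list))

(display (string-length big_string))
(newline)
```

With the function above, the call would read (list->string (file->char_list "mydata.txt")).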

Another method is:

;works in Chicken, Racket
(foldr (lambda (x y) (string-append (string x) y)) "" (file->char_list "mydata.txt"))

There are two problems. First, foldr is not available everywhere (SISC does not recognize it), though it can be defined. Second, this method is very slow, because every character is appended separately and each string-append builds a new string.
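Where foldr is missing, a definition along these lines should do. This is only a sketch, and the name my_foldr is chosen here so that it does not clash with a built-in foldr:

```scheme
;Hypothetical fallback for implementations without foldr.
;Combines the elements from the right: (f x1 (f x2 ... (f xn init)))
(define (my_foldr f init lis)
  (if (null? lis)
      init
      (f (car lis) (my_foldr f init (cdr lis)))))

;Same call shape as the foldr example, on a short in-memory list
(define small_result
  (my_foldr (lambda (x y) (string-append (string x) y))
            ""
            (string->list "abc")))

(display small_result)
(newline)
```

It is not tail-recursive, so very long lists may hit a recursion limit in some implementations.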

I wrote the next two functions to slice a list of chars into nested lists until no list at the lowest level exceeds the maximum argument count in Chicken. The third function traverses the nested char list and returns a string, using string and string-append:

;Split list a into (first-n-elements rest)
(define (cleave_at n a)
  (cond
   ((zero? n) (list '() a))
   ((null? a) (list '() '())) ;list ran out before n elements were taken
   (else
    ((lambda (x)
      (cons (cons (car a) (car x)) (cdr x)))
     (cleave_at (- n 1) (cdr a))))))

;Halve list a recursively until no piece is longer than n (n must be at least 1)
(define (cleave_binary_nest n a)
 (cond
  ((<= (length a) n) (list a))
  (else
   ;recurse into BOTH halves; with an odd length the second half is one longer
   (map (lambda (y) (cleave_binary_nest n y))
        (cleave_at (quotient (length a) 2) a)))))

(define (binary_nest_char->string a)
 (cond
  ((null? a) "")
  ((char? (car a)) (apply string a))
  (#t (string-append
    (binary_nest_char->string (car a)) (binary_nest_char->string (cdr a))))))

The function is called like this:

;Works in Racket, Chicken, SISC
;faster than foldr method (3x faster interpreted Chicken) (30x faster compiled Chicken) (125x faster Racket gui)
(binary_nest_char->string (cleave_binary_nest 2048 (file->char_list "mydata.txt")))

To reduce the text to alphabetic characters and spaces, there are two more functions:

(define (alphaspace? x)
 (cond
  ((and (char-ci>=? x #\a) (char-ci<=? x #\z)) #t)
  ((equal? x #\space) #t)
  (#t #f)))

(define (filter pred lis)
  ; if lis is empty
  (if (null? lis)
    ; return an empty list
    '()
    ; otherwise, if the predicate is true on the first element
    (if (pred (car lis))
      ; return the first element concatenated with the
      ; result of calling filter on the rest of lis
      (cons (car lis) (filter pred (cdr lis)))
      ; otherwise (if the predicate was false) just
      ; return the result of filtering the rest of lis
      (filter pred (cdr lis)))))

(define data (file->char_list "mydata.txt"))
(define data_alphaspace (filter alphaspace? data))
(define result (binary_nest_char->string (cleave_binary_nest 2048 data_alphaspace)))

This works on Racket, Chicken (interpreted and compiled), and SISC (Java). Each of those implementations should also run on Linux, Mac (OS X), and Windows.
