Saying when: handling tense in conversation

Written by Dylan Holmes

1 Vision

Note: You can run this code in your browser using the REPL in the section "What you can do".

Download Clojure source (GPL 3+)

1.1 What the program can do

I'll introduce my question-answering project by giving an example that demonstrates what it does. The scenario is as follows:

Scenario: Someone's hosting a party with a number of guests who are expected to arrive in a certain order [1]:

timeline-1.png

In other words, Ash will arrive first, followed by Bertie, Charlie, David, Derek, Ed, Fred, George, and Harley. One of the invited guests, Zelda, won't be coming.

Before the party begins, you decide to have a conversation with the host. Here's a transcript, in which the computer program plays the role of party host:

timeline-3.png

Is Harley coming?
> YES.

Is Ash?
> YES.

Is Zelda?
> NO; SHE NEVER WILL.

Did Ash arrive safely?
> SINCE WHEN?

When David gets here, will Bertie have arrived?
> YES.

Will Ed?
> NO. NOT BY THEN.

Will Charlie just have?
> YES.

Will Harley turn up after George?
> YES.

Just after?
> YES.

Will Ash?
> NO.

When Zelda gets here, will Ed have arrived?
> ZELDA NEVER WILL

Later in the party — after Charlie arrives, but before David does — you have a second conversation:

timeline-2.png

Is Ash coming?
> NO; SHE ALREADY HAS.

Is Ed?
> YES.

When Fred gets here, will George have arrived?
> NO; NOT BY THEN.

Has Fred arrived since Ed got here?
> ED HASN'T YET.

This performance has several noteworthy features:

  1. It's about time. The program can understand questions about temporal events, and can reply using appropriately-tensed responses.
  2. The conversation accumulates context. Notice how naturally the conversation flows; the question "When David gets here, will Bertie have arrived?" is followed immediately afterward by "Will Ed?" (rather than the more explicit "[When David gets here, ]will Ed[ have arrived]?"). The program makes such everyday elision possible by storing the conversational context.
  3. The program handles mistaken questions. There are questions whose answer is no ("Will Ed?" "NO."), but more importantly there are questions whose premises are mistaken ("Has Fred arrived since Ed got here?" "ED HASN'T YET" or "When Zelda gets here, will Ed have arrived?" "ZELDA NEVER WILL"). One of the primary goals of this project was to design a system that gracefully handles questions with malformed assumptions; after all, we humans readily know what to say when such questions occur in everyday life.
  4. The responses say just enough. Notice how the computer responds to "Will Ed [have arrived]?" ("NO. NOT BY THEN"): it supplies the extra phrase "NOT BY THEN" to reassure us that Ed will, in fact, be coming at some point after David. Or compare the response to "Is Zelda [coming]?" ("NO; SHE NEVER WILL") with the response to "When Zelda gets here, will Ed have arrived?" ("ZELDA NEVER WILL"). A pronoun will do in the first case but not in the second; in general, a name is required to distinguish which of the two people (Ed or Zelda) will never arrive [2]. See also: "Has Fred arrived since Ed got here?" "ED HASN'T YET".
  5. Replies are generated, not canned. One underlying feature of this program is that it assembles its replies from semantic information about tense, aspect, polarity, agent, and so on; responses are not simply looked up in a table. Thus an additional feature of this program is its ability to generate tensed English sentences from semantic desiderata.

1.2 What I aimed to do

In ordinary conversation, we humans are adept at understanding and answering questions about what has happened or might happen. For this project, I set out to create a computer program that can converse naturally about what has happened when.

From the beginning, I was inspired by Winograd's work in blocks world and Longuet-Higgins's book Mental Processes (on which this project is based). I wanted first of all to make a language-understanding system that didn't simply produce trees out of sentences (which statistical parsers can do even on nonsensical input), but rather one which used language as part of acting competently in a particular domain.

In short, I wanted language-understanding to serve a particular purpose, to be directed toward a particular goal.

When I selected time and tense as the domain of expertise, several more goals presented themselves: At a high level, I wanted the computer to converse "naturally". This led to two concrete goals: I wanted the system to behave conversationally, with a reasonable sense of where the conversation has been, and of what words can be elided. And I wanted the system especially to handle malformed questions — questions which are innocuously well-formed, but at odds with the facts. To handle those questions, I felt, would be the mark of graceful, conversant question-answering.

As far as these goals are concerned, I've handled them pretty well. The system still has a few bugs in places — situations which it handles differently than its human counterparts — but it shows off an impressive command of temporal information and conversational constraint.

I had a few additional goals which I haven't been able to achieve yet, but which still seem possible and useful. In particular, I wanted to explore how the simple nature of tense structure (being defined in terms of an occurrence time, reference time, and utterance time) made it easy to learn. I wanted to teach, through something like near-miss learning, the transduction function that maps a question and the facts at hand onto the semantic qualities of its answer. Such advancements would make language-learning and evolution into a part of this project; I intend to include them in future incarnations of this work.

1.3 What you can do

You can download the source arrivals.clj, or use the web-based REPL.

1.3.1 Run the demo with run-demo

To run the demo once you're in the ai.logical.arrivals namespace, simply call (run-demo) from the REPL.

(run-demo)

The program will run through an automatic sequence of questions (the ones shown in the introduction) to demonstrate its capabilities.

Once you have a taste for how the program works, you can try out questions or statements of your own using the methods in the next two sections.

1.3.2 Ask questions with ask

Asking a question is as straightforward as calling ask with your question as a string argument. For example:

(ask "Will Ash have arrived when Bertie gets here?")

Note that the time of conversation is automatically set to before the party, and the conversation accumulates context until reset either manually or by conversational cues.

To set a new time for the conversation, use, for example:

(set-current-time! 3.5)

You can pass any real number as an argument. With this representation, one guest arrives at each integer time from 1 to 9, so 3.5 corresponds to the time after Charlie arrives but before David does, as in the second demo conversation.
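The numeric timeline can be pictured as a small lookup table. The following is only a sketch: the name arrival-time and the lowercase string keys match how the source code later in this document uses them, but the helper guests and the way the map is built here are assumptions, not the author's definitions.

```clojure
;; Sketch only: the guest order comes from the scenario; building the
;; map with zipmap is an assumption about how it might be written.
(def guests
  ["ash" "bertie" "charlie" "david" "derek" "ed" "fred" "george" "harley"])

;; Each guest arrives at a successive integer time, starting at 1.
(def arrival-time
  (zipmap guests (iterate inc 1)))

(arrival-time "charlie") ;; => 3
(arrival-time "david")   ;; => 4
(arrival-time "zelda")   ;; => nil, since Zelda never arrives

;; A conversation held at time 3.5 falls between Charlie and David.
(< (arrival-time "charlie") 3.5 (arrival-time "david")) ;; => true
```

Representing a no-show as a missing key means that "never arrives" can be detected with a plain nil check.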

To reset the conversational context, you can use

(forget!)

which causes the program to lose track of the current thread of conversation. Alternatively, to revert the entire program to its initial state (i.e. to reset the current time and the conversational context), you can use

(restart!)

This will return the program to its initial state, including the conversational time.
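One way to picture what these three commands do is as updates to a handful of refs. The ref names now, reference-time, and contextual-memory appear in the source for ask later in this document; the function bodies below and the initial time of 0 are assumptions made for illustration, not the definitions in arrivals.clj.

```clojure
;; Hypothetical reconstruction; only the ref names come from the source.
(def now (ref 0))                        ; assumed: 0 means "before the party"
(def reference-time (ref {:who :now}))
(def contextual-memory (ref nil))

(defn set-current-time!
  "Move the conversation to time t."
  [t]
  (dosync (ref-set now t)))

(defn forget!
  "Drop the current thread of conversation, keeping the current time."
  []
  (dosync
   (ref-set contextual-memory nil)
   (ref-set reference-time {:who :now})))

(defn restart!
  "Reset both the conversational context and the current time."
  []
  (dosync
   (ref-set contextual-memory nil)
   (ref-set reference-time {:who :now})
   (ref-set now 0)))

(set-current-time! 3.5)
@now ;; => 3.5
(restart!)
@now ;; => 0
```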

1.3.3 Generate English with say

To generate English with say, you pass a hash-map containing the features you want in your reply.

The complete set of possibilities is:

tense
Any of the keywords :past, :present, or :future. Assumes present tense by default.
aspect
Either of the keywords :perfective or :progressive. Assumes progressive aspect by default.
sense
Only the keyword :immediate, to indicate that the question is whether someone will just have arrived, etc.
who
The subject of the sentence, i.e. a string containing a name.
negated
If true, the response will have negative polarity (e.g. "isn't coming"). Otherwise, the response will have positive polarity (e.g. "is coming").
reply-yes
If true, says YES before replying. If false, says NO before replying. If missing or nil, just replies. (This is distinct from polarity, as you can reasonably say "YES; SHE'S NOT COMING." or "NO; SHE WILL BE COMING AFTER CHARLIE".)

Here's an example:

(say {:who "Dylan", :negated true, :tense :future, :aspect :perfective })
;; DYLAN WON'T HAVE BY THEN

2 How the program answers a question

2.1 Determine what words mean (lexicalization)

The program neither parses questions into a complete tree structure, nor treats the sentence as a bag of keywords. Instead, the program identifies all the words that it knows, and then groups those words into clauses based on certain heuristics.

First, the program looks up each word in its dictionary, which contains the names of all expected guests as well as auxiliary verbs such as "had" and conjunctions such as "after". Each recognized word is replaced with a dictionary of its features, and unrecognized words are ignored. (In particular, the program knows no main verbs at all, so a sentence such as "After Charlie arrives, will David show up?" is equivalent to "AFTER CHARLIE, WILL DAVID?". It's an impressive fact of our language that these words are all you need for determining timing information.)

Words in the lexicon may have some of the following features:

conjunction
a part-of-speech label attached to words like "after", "since", "when", "before".
who
a part-of-speech label indicating that the word denotes a person's name.
question
a part-of-speech label indicating that a word is an auxiliary verb, and therefore can be used to introduce a question.

Question words also have information about the timing information they provide:

tense
(marking a time of reference). Tense may be "past" (reference time before now), "present" (reference time is now), or "future" (reference time after now).
aspect
(describing how an event relates to the reference time). Aspect may be "perfective" (happening before the reference time), or "progressive" (happening after).

Each word appears at most once in the dictionary, so there's no lexical ambiguity in this program (i.e. no word has more than one possible meaning).

There is exactly one auxiliary verb for each combination of tense and aspect (3 × 2 = 6).

(defn identify-words
  "Looks up the features of each word in the statement, returning a
list of hash maps. If a word is not known, it is removed from the sentence."
  [s]
  (let [lexicon
        {"when"  {:conjunction true} ;; note: not used as question word
         "after" {:conjunction true}

         ;; note: "since" can occur in any tense, provided aspect is perfective
         "since" {:conjunction true :requires-perfective true}
         "will"  {:question true :tense :future}
         "has"   {:question true :tense :present :aspect :perfective} ;; progressive?
         "did"   {:question true :tense :past}
         "is"    {:question true :tense :present :aspect :progressive}
         "just"  {:sense :immediate} ;; hack
         "have"  {:aspect :perfective}
         "had"   {:aspect :perfective
                  :requires-perfective true
                  :question true :tense :past}}] ;; question and tense: recent addition
    (remove nil?
            (for [word (map lc (str/split s #"\s+"))]
              (cond (some #{word} people) {:who word}
                    (get lexicon word)    (lexicon word))))))
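To see the lookup in isolation, here is a minimal, self-contained sketch of the same idea with a cut-down vocabulary. The names known-people, mini-lexicon, and lookup-words are hypothetical stand-ins; the real function relies on helpers (lc, people) that are defined elsewhere in the source and not shown here.

```clojure
(require '[clojure.string :as str])

;; Hypothetical, cut-down stand-ins for the program's guest list and lexicon.
(def known-people #{"charlie" "david"})
(def mini-lexicon
  {"after" {:conjunction true}
   "will"  {:question true :tense :future}})

(defn lookup-words
  "Replace each known word with its feature map; drop unknown words."
  [s]
  (keep (fn [word]
          (if (known-people word)
            {:who word}
            (mini-lexicon word)))
        (map str/lower-case (str/split s #"\s+"))))

(lookup-words "After Charlie arrives, will David show up?")
;; => ({:conjunction true} {:who "charlie"}
;;     {:question true, :tense :future} {:who "david"})
```

The words "arrives,", "show", and "up?" are unknown and simply vanish, which is why the sentence reduces to "AFTER CHARLIE, WILL DAVID?".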

2.1.1 Table of Verbs

                    Past tense   Present tense   Future tense
Perfective aspect   HAD          HAS             WILL HAVE
Progressive aspect  DID          IS              WILL
2.2 Divide the sentence into parts

I designed the program to handle questions with one or two clauses (e.g. "Is X coming?" or "Will X arrive after Y?"). After the program determines the meaning of each word, it groups the words into clauses. To accomplish this grouping, I developed the following heuristic, which seems to work well in all the cases I've tried:

  1. Clauses should contain as many words as possible. (In particular, read the words in order from left to right and put them all into the same clause unless doing so would violate one of the following constraints.)
  2. Clauses should contain at most one question word.
  3. Clauses should contain at most one conjunction word.

(defn build-clauses
  "Given a list of maps (as produced by identify-words), consolidate
  them into a shorter list of maps that each contain at most one question-word and at
  most one conjunction-word."
  [ms]
  (loop [closed []
         open   ms
         clause nil]
    (if-let [m (first open)]
      (cond
        ;; The first word always opens a clause.
        (nil? clause)
        (recur closed (rest open) m)

        ;; A question word, a conjunction, or a second aspect marker
        ;; closes the current clause and opens a new one.
        (or (:question m)
            (:conjunction m)
            (and (:aspect clause) (:aspect m)))
        (recur (conj closed clause) (rest open) m)

        ;; Anything else is merged into the current clause.
        :else
        (recur closed (rest open) (merge clause m)))
      ;; No words left: close the final clause, if there is one.
      (if-not (nil? clause)
        (conj closed clause)
        closed))))
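As a cross-check on the grouping rule, here is the same boundary test written as a reduce, run on hand-built word maps. The names starts-new-clause? and group-clauses are hypothetical; this is a sketch of the technique, not the author's code.

```clojure
(defn starts-new-clause?
  "True when word map m cannot join the currently open clause."
  [clause m]
  (boolean (or (:question m)
               (:conjunction m)
               (and (:aspect clause) (:aspect m)))))

(defn group-clauses
  "Fold word maps into clauses; the last element of the vector is the open clause."
  [ms]
  (reduce (fn [clauses m]
            (if (or (empty? clauses)
                    (starts-new-clause? (peek clauses) m))
              (conj clauses m)
              (conj (pop clauses) (merge (peek clauses) m))))
          []
          ms))

;; Word maps for "When David gets here, will Bertie have arrived?"
(group-clauses [{:conjunction true}
                {:who "david"}
                {:question true :tense :future}
                {:who "bertie"}
                {:aspect :perfective}])
;; => [{:conjunction true, :who "david"}
;;     {:question true, :tense :future, :who "bertie", :aspect :perfective}]
```

The first clause carries the reference time ("when David") and the second carries the actual question ("will Bertie have").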

2.3 Fill in missing information using prior conversation

In casual speech, we often omit otherwise essential pieces of information when they can be "inferred from context". The variable contextual-memory implements this behavior by keeping track of relevant information from earlier in the conversation.

Here, contextual memory keeps track of the reference time, tense, aspect, immediacy, and people most recently mentioned. If any of these attributes are missing from a question, they are filled in from context.

Hence you can naturally ask a sequence of two questions:

(ask "Will Harley show up just after Fred?")
;; ["Will Harley show up just after Fred?" "NO."]

(ask "Will George?")
;; ["Will George?" "YES."]

rather than the more explicit "Will George show up just after Fred?", which is implied.
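The gap-filling itself amounts to a map merge. The sketch below uses hand-written clause maps, on the assumption that they resemble what the program stores for these two questions; the names remembered and follow-up are hypothetical. In the program, the reference clause ("after Fred") is held separately, in reference-time.

```clojure
;; Main clause remembered from "Will Harley show up just after Fred?"
(def remembered
  {:question true :tense :future :who "harley" :sense :immediate})

;; Main clause extracted from the follow-up "Will George?"
(def follow-up
  {:question true :tense :future :who "george"})

;; Later values win, so the new subject replaces the old one while the
;; missing :sense is inherited from the earlier question.
(merge remembered follow-up)
;; => {:question true, :tense :future, :who "george", :sense :immediate}
```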

This contextual memory is managed by the ask function. The ask function parses the sentence into clauses (as described above), fills in gaps by looking up the contextual memory, calls the evaluate! subroutine to answer the question, then stores the information in memory before returning the answer.

(defn ask
  "Ask a question, possibly specifying the time of utterance. If no
time is specified, uses @now."
  ([when s]
   (dosync
    (ref-set now when)
    (let [s* (remove-punctuation s)
          clauses (build-clauses (identify-words s*))
          main-clause (some identity
                            (remove has-reference-time clauses))
          new-reference-clause (some has-reference-time clauses)]

      (cond
        ;; If the sentence contains a reference clause such as "After
        ;; X arrived, ...", reset the contextual memory and set the
        ;; reference time to the reference clause.
        new-reference-clause
        (if (:who new-reference-clause)
          (do
            (ref-set reference-time
                     (merge @reference-time new-reference-clause))
            (ref-set contextual-memory nil)))

        ;; Otherwise, if the sentence is in present tense, set the
        ;; reference time to the moment of utterance.
        (= :present
           (get (merge @contextual-memory main-clause) :tense))
        (do (ref-set reference-time {:who :now})
            (ref-set contextual-memory nil)))

      (let [e (evaluate! main-clause)]
        ;; Store the current clause in contextual memory for the next question.
        (ref-set contextual-memory
                 (merge @contextual-memory main-clause))
        [s (first (apply say (take 2 e)))]))))
  ([s]
   (ask @now s)))

2.4 Determine whether the question is right, wrong, or misguided

The evaluate! subroutine performs the main work of the question-answering system. Its task is to determine whether the questioner makes any erroneous assumptions (like asking about someone who will never come, or asking when someone will show up despite the fact that the person has already arrived) and reply gracefully.

If the question is not faulty, then the evaluate! function uses tense and aspect information to check whether the answer to the question is Yes or No.

Here is a rough flowchart:

  1. Has a reference time been defined? (A reference time is specified by including a clause like "After …" or "When …".)

    1. If reference time has not been defined, then the question must use the present tense (or else it isn't a well-defined question; the computer will complain and ask "Since when?".)
    2. Does the subject in question actually show up at some point? If not, the questioner has made a faulty assumption. ("… never will.")
    3. Does the subject show up at the proper time, according to the verb aspect? There are exactly two types of question — "Has … arrived?" and "Is … coming?". They require the subject to be before or after the current moment, respectively, which determines whether the answer is YES or NO.
  2. If a reference time has been defined, then the procedure is the same as above, except we also must check if the person used as a reference time ever shows up, and check the verb tense along with verb aspect.
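The first branch of this flowchart (present tense, no reference time) is small enough to sketch on its own. The function diagnose and its keyword results are hypothetical simplifications; the real evaluate! below returns a richer description of the reply.

```clojure
;; Arrival times follow the scenario; Zelda is absent because she never arrives.
(def arrival-time
  {"ash" 1 "bertie" 2 "charlie" 3 "david" 4 "derek" 5
   "ed" 6 "fred" 7 "george" 8 "harley" 9})

(defn diagnose
  "Classify a present-tense question about who, asked at time now.
  aspect is :perfective (has arrived?) or :progressive (is coming?)."
  [who aspect now]
  (let [arrival   (arrival-time who)
        in-order? (if (= :perfective aspect) < >)]
    (cond
      (nil? arrival)          :never-will ; faulty assumption: a no-show
      (in-order? arrival now) :yes
      :else                   :no)))

(diagnose "zelda" :progressive 0)  ;; => :never-will
(diagnose "ed" :progressive 3.5)   ;; => :yes
(diagnose "ash" :progressive 3.5)  ;; => :no
(diagnose "ash" :perfective 3.5)   ;; => :yes
```

These four calls line up with the transcript: Zelda never will, Ed is still coming at time 3.5, and Ash is no longer "coming" because she already has arrived.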

(defn evaluate! [clause]

  (let [clause* (merge @contextual-memory clause)

        response (-> clause*
                     (dissoc :question)
                     (assoc :response :response))

        ;; (< ref now) for perfective, (> ref now) for progressive
        aspect≼ (if (= :perfective (:aspect clause*)) < >)

        aspect≼ (if (and (:requires-perfective @reference-time)
                         (= :present (:tense clause*))
                         (= :perfective (:aspect clause*)))
                  > aspect≼)



        ;; (< event ref) for past, (> event ref) for future
        tense≼ (cond (= :past (:tense clause*)) <
                     (= :future (:tense clause*)) >
                     :else
                     (if (= :perfective (:aspect clause*)) < >)
                     )
        ]
    (cons clause*
          (if-not
              ;; NO REFERENCE TIME DEFINED --- ONLY PRESENT TENSE CONSTRUCTS ALLOWED.
              (string? (:who @reference-time))
            (cond
             (nil? (arrival-time (:who clause*)))
             [(-> response
                  (assoc :reply-yes false
                         :negated true ;; whether reply includes "not"
                         :tense :future
                         :aspect :progressive
                         ))
              "He never will."
              "Malformed: Main clause is no-show."]

             (and (:tense clause*)
                  (not= :present (:tense clause*)))
             [(assoc response :verbatim "Since when?")
              "Since when?"
              "Malformed: Tensed construct without ref time."]

             ;; check whether statement is true in terms of aspect.
             (aspect≼ (arrival-time (:who clause*)) @now)
             ;; check immediacy
             (if (= :immediate (:sense clause))
               (if (adjacently aspect≼
                               (vals arrival-time)
                               (arrival-time (:who clause*))
                               @now)

                 [(assoc response :reply-yes true)
                  "Yes." "Well-formed, and aspect says yes. Immediacy is go."]
                 [(assoc response :reply-yes false)
                  "Not just." "Well-formed, and aspect says yes. Immediacy is no-go."])
               [(assoc response :reply-yes true)
                "Yes." "Well-formed, and aspect says yes. Immediacy is non-issue."])

             :else


             ;; present prog: 
             [(assoc response
                :reply-yes false
                :aspect :perfective
                :negated false
                :tense :present
                )
              "Already has" "Well-formed, but aspect says no."]
             )

            ;; REFERENCE TIME DEFINED
            (cond
             (nil? (arrival-time (:who @reference-time)))
             [(-> response
                  (assoc
                      :who (:who @reference-time)
                      :reply-yes nil
                      :negated true ;; whether reply includes "not"
                      :tense :future
                      :aspect :progressive
                      ))
              "REF'll never come." "Malformed: reference person is a no-show."]

             (not (tense≼ (arrival-time (:who @reference-time)) @now ))
             [

              ;; tense failure
              (-> response
                  (assoc
                      :reply-yes nil
                      :who (:who @reference-time)
                      :aspect :perfective
                      :tense :present
                      :negated (or (= :past (:tense clause*))
                                   (and (= :present (:tense clause*))
                                        (= :perfective (:aspect clause*))))
                      )
                  )
              "REF has already arrived/hasn't arrived yet."
              "Malformed: the reference time's tense makes an untrue assumption."
              (:tense clause*) (:aspect clause*)

              "R" "n"
              tense≼ (arrival-time (:who @reference-time)) @now 
              ]

             (nil? (arrival-time (:who clause*)))
             [(assoc response
                :reply-yes nil
                :negated true
                :tense :future
                :aspect :progressive)

              "MAIN isn't coming" "Malformed: the main clause is a no-show"]


             ;; check whether statement is true in terms of aspect.
             (and (aspect≼ (arrival-time (:who clause*))
                           (arrival-time (:who @reference-time)))
                  ;; the future of the reference is bounded by the present.
                  (or (not (< (arrival-time (:who @reference-time)) @now))
                           (< (arrival-time (:who clause*)) @now)))




             ;; check immediacy
             (if (= :immediate (:sense clause*)) 
               (if (adjacently aspect≼
                               (vals arrival-time)
                               (arrival-time (:who clause*))
                               (arrival-time (:who @reference-time)))
                 [(assoc response :reply-yes true)
                  "Yes." "Well-formed, and aspect says yes. Immediacy is go."]
                 [(assoc response :reply-yes false :negated true)
                  "Not just." "Well-formed, and aspect says yes. Immediacy is no-go."])
               [(assoc response :reply-yes true)
                "Yes." "Well-formed, and aspect says yes. Immediacy is non-issue."])



             ;; statement succeeds in tense but not aspect
             (aspect≼ (arrival-time (:who clause*)) @now (arrival-time (:who @reference-time)))
             [(assoc response
                :reply-yes nil
                :tense :present
                :aspect :progressive
                :negated (= :past (:tense clause*))) 
              "has/n't already arrived"
              "Well-formed, but aspect says no."]

             (and (= :past (:tense clause*))
                  (= :progressive (:aspect clause*))
                  (> (arrival-time (:who clause*)) (arrival-time (:who @reference-time))))
             [(assoc response
                :reply-yes false
                :negated false
                :aspect :perfective
                )
              "had already arrived by then"
              "Well-formed, but aspect says no."]

             :else
             [(assoc response :reply-yes false :negated true)
              "No." "Well-formed, but aspect says no."]
             )))))
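The immediacy checks above call a helper, adjacently, whose definition is not shown in this document. Judging only from how it is called (a comparison function, the list of all arrival times, and two times), one plausible reading is "a and b are in the right order and nobody arrives strictly between them". The following is a guess at such a helper, not the author's implementation.

```clojure
(defn adjacently
  "Hypothetical reconstruction: true when (cmp a b) holds and no time
  in times falls strictly between a and b under cmp."
  [cmp times a b]
  (and (cmp a b)
       (not-any? #(and (cmp a %) (cmp % b)) times)))

(def times (range 1 10)) ; one arrival at each integer from 1 to 9

(adjacently < times 3 4)   ;; => true  (Charlie just before David)
(adjacently < times 2 4)   ;; => false (Charlie arrives in between)
(adjacently > times 9 8)   ;; => true  (Harley just after George)
(adjacently < times 3 3.5) ;; => true  (Charlie just before "now" at 3.5)
```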

2.5 Build an answer in the mind

The dispatch in the previous section was the most difficult and enjoyable part of this project; it involved exhaustively exploring the possible bugs in questions. The second-most challenging part was determining what to say in response, given that you've diagnosed the question.

I divided the task of answering into two parts. In the first part, the program describes the kind of answer it wants to give. The answer takes the form of a collection of attributes: Who's the subject of the reply? What's the tense and aspect of the reply? Should the reply be negated (i.e. include "not")? And so on.

Here's a representative example, excerpted from the evaluate! code in the previous section:

  (assoc response
    :who (:who @reference-time)
    :reply-yes nil
    :negated true ;; whether reply includes "not"
    :tense :future
    :aspect :progressive
    )

;; This corresponds roughly to the reply: "HE WILL NEVER ARRIVE."

The evaluate! function itself returns a reply in this bag-of-attributes form. The original calling function, ask, then passes the reply to the English language generator which converts the reply into a sensible English sentence.

2.6 Convert the answer into English words

The say subroutine performs the work of translating semantic information into English. It's straightforward to convert tense, aspect, and negation information into an appropriate verb phrase such as "won't have by then". The clever part comes from using contextual information to supply an answer that avoids repeating too much of the question. So for example, if you ask "Has Cuthbert arrived?", the reply substitutes a pronoun for Cuthbert, yielding a natural-sounding "NO; HE NEVER WILL".

(ask "Has Cuthbert arrived?")
;; ["Has Cuthbert arrived?" "NO; HE NEVER WILL"]

In contrast, if the question is "Will Derek have arrived when Cuthbert shows up?", the response merits a repeat of Cuthbert's name:

(ask "Will Derek have arrived when Cuthbert shows up?")
;; ["Will Derek have arrived when Cuthbert shows up?" "CUTHBERT NEVER WILL"]

because replying "HE NEVER WILL" would leave it unclear whether it was Cuthbert or Derek who will never arrive.
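The name-versus-pronoun decision can be isolated in a few lines. This sketch mirrors the rule used inside say (shown next): use a pronoun only when the reply is about the same person the question asked about. The table pronouns and the function reply-subject are hypothetical names; the program's own table is called pronoun-preferred and its contents are not shown in this document.

```clojure
;; Hypothetical pronoun table, for illustration only.
(def pronouns {"cuthbert" "he" "zelda" "she"})

(defn reply-subject
  "Pick the subject of the reply: a pronoun if the reply concerns the
  person asked about, otherwise the other person's name."
  [asked-about reply-about]
  (if (= asked-about reply-about)
    (get pronouns reply-about "they")
    reply-about))

(reply-subject "cuthbert" "cuthbert") ;; => "he"
(reply-subject "derek" "cuthbert")    ;; => "cuthbert"
(reply-subject "harley" "harley")     ;; => "they" (no pronoun on record)
```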

(defn say
  "Convert a map of semantic information into a response string."
  [clause m]
  (let [pro "they"
        who (if (not= (:who clause) (:who m))
              (:who m)
              (get pronoun-preferred (:who clause) "they"))
        vps {[:past :perfective]     ["had" "hadn't by then"]
             [:past :progressive]    ["did" "didn't by then"]

             [:present :perfective]  ["already has" "hasn't yet"]
             [:present :progressive] ["is" "isn't"]

             [:future :perfective]   ["will have" "won't have by then"]
             [:future :progressive]  ["will" "never will"]}

        differences
        (->> (list :who :tense :aspect)
             ;; (concat (keys m) (keys clause))
             distinct
             (map (juxt identity m clause))
             (filter #(not= (nth % 1) (nth % 2)))
             (map (comp vec (partial take 2)))
             (into {}))

        affirmation
        (cond (true? (:reply-yes m)) "Yes."
              (false? (:reply-yes m)) "No.")]

    [(clojure.string/upper-case
      (cond
        (:verbatim m)
        (:verbatim m)

        (and (= (:who m) (:who clause))
             (= (:tense m) (:tense clause))
             (= (:aspect m) (:aspect clause)))
        (str affirmation
             (if (and (false? (:reply-yes m))
                      (= :perfective (:aspect m)))
               " Not by then"))

        :else
        (str (cond (true? (:reply-yes m)) "Yes; "
                   (false? (:reply-yes m)) "No; ")
             who " "
             (get-in vps [[(get m :tense :present) (get m :aspect :progressive)]
                          (if (:negated m) 1 0)]))))
     differences
     m]))
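The verb-phrase lookup at the heart of say can be tried on its own. The sketch below copies the phrase table from the code above and wraps the lookup in a hypothetical helper, verb-phrase, which applies the same defaults (present tense, progressive aspect).

```clojure
;; Phrase table copied from say: [affirmative negative] for each combination.
(def verb-phrases
  {[:past :perfective]     ["had" "hadn't by then"]
   [:past :progressive]    ["did" "didn't by then"]
   [:present :perfective]  ["already has" "hasn't yet"]
   [:present :progressive] ["is" "isn't"]
   [:future :perfective]   ["will have" "won't have by then"]
   [:future :progressive]  ["will" "never will"]})

(defn verb-phrase
  "Hypothetical helper: look up the phrase for a reply map, defaulting
  to present tense and progressive aspect."
  [{:keys [tense aspect negated]
    :or   {tense :present, aspect :progressive}}]
  (get-in verb-phrases [[tense aspect] (if negated 1 0)]))

(verb-phrase {:tense :future :aspect :perfective :negated true})
;; => "won't have by then"
(verb-phrase {:tense :future :negated true}) ;; => "never will"
(verb-phrase {})                             ;; => "is"
```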

3 Contributions

For this project, I have developed a question-answering system to model how humans make use of temporal markers and conversational context when speaking. The system interacts through a natural English dialogue, allowing the questioner and the responder to leave out superfluous information, and supplying helpful corrective answers whenever the questioner makes a faulty assumption. All replies are generated from semantic information, and are not simply looked up in a table.

To summarize these points as reported in the introduction:

  1. It's about time.
  2. The conversation accumulates context.
  3. The program handles mistaken questions.
  4. The responses say just enough.
  5. Replies are generated, not canned.

Footnotes:

1

Namely, alphabetical.

2

When the two people use different pronouns, those technically suffice to disambiguate who is meant; however, in an informal survey of people I know, the unanimous opinion was that even in such cases, using names feels clearer.