@jjtolton jjtolton commented Oct 24, 2025

Implements the double bar (||) operator as specified in:
https://www.complang.tuwien.ac.at/ulrich/iso-prolog/double_bar

as discussed in #3132 (comment)

Summary

Adds support for combining double-quoted strings with partial list notation:

  • "abc"||K produces [a,b,c|K]
  • "a"||"b"||"c" produces [a,b,c]
  • ""||K unifies with K (empty collapse)
  • Multi-line support with comments
  • Proper syntax validation

Changes

Lexer (src/parser/lexer.rs)

  • Added DoubleBar token type
  • Modified lexer to detect || (vs single |)
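As a rough illustration of the lexer change, here is a minimal sketch of telling `||` apart from a single `|` with one character of lookahead. Only the name `DoubleBar` comes from this PR; the `Token` enum and `lex_bars` function are hypothetical and far simpler than the real lexer in src/parser/lexer.rs.

```rust
/// Toy token type. `DoubleBar` mirrors the token name used in this PR;
/// the other variants are hypothetical placeholders.
#[derive(Debug, PartialEq)]
enum Token {
    Bar,
    DoubleBar,
    Other(char),
}

/// Scan `input`, emitting `DoubleBar` for two adjacent `|` characters
/// and `Bar` for a lone `|`. Whitespace is dropped; every other
/// character becomes `Other`.
fn lex_bars(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '|' {
            if chars.peek() == Some(&'|') {
                chars.next(); // consume the second bar
                tokens.push(Token::DoubleBar);
            } else {
                tokens.push(Token::Bar);
            }
        } else if !c.is_whitespace() {
            tokens.push(Token::Other(c));
        }
    }
    tokens
}
```

In this sketch `lex_bars("a||b")` yields a `DoubleBar` between the two `Other` tokens, while the bar in `[a|b]` stays a plain `Bar`.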

Parser (src/parser/parser.rs)

  • Added DoubleBar to TokenType enum
  • Implemented operator handling with priority 1 (as per spec)
  • Validates that || only appears after string literals:
    • ✅ Allows: "abc"||K
    • ❌ Rejects: K||[] (variable)
    • ❌ Rejects: ("a")||[] (parenthesized)
  • Handles all edge cases:
    • Non-empty strings: Creates PartialString term
    • Code lists: Replaces tail in cons cells
    • Empty strings: Correctly collapse to tail
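The validation and construction rules above can be sketched as follows, assuming double_quotes is set to chars. The `Term` enum and the `double_bar` function are a hypothetical, simplified model for illustration, not Scryer's actual types.

```rust
/// Hypothetical, simplified term model (not Scryer's actual `Term` type).
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Var(String),
    Atom(String),
    Nil,
    Cons(Box<Term>, Box<Term>),
    /// A double-quoted literal as written in the source text.
    DoubleQuoted(String),
}

/// Build the term for `lhs || tail`, assuming double_quotes = chars.
/// Only a double-quoted literal is accepted on the left; anything else
/// is reported as a syntax error.
fn double_bar(lhs: &Term, tail: Term) -> Result<Term, String> {
    match lhs {
        // Fold from the right so the characters end up in source order.
        // An empty literal leaves `tail` untouched (the empty collapse).
        Term::DoubleQuoted(s) => Ok(s.chars().rev().fold(tail, |acc, c| {
            Term::Cons(Box::new(Term::Atom(c.to_string())), Box::new(acc))
        })),
        _ => Err("syntax_error(incomplete_reduction)".to_string()),
    }
}
```

With this model, an empty literal returns the tail unchanged, and a variable or list on the left of `||` is rejected.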

Tests (src/tests/double_bar.pl)

Comprehensive Prolog integration tests (11 tests, all passing):

  • Basic functionality
  • Multi-line with line comments
  • Multi-line with block comments
  • Empty string edge cases
  • Chaining multiple strings
  • All invalid syntax cases

Examples

?- L = "abc"||K.
   L = [a,b,c|K].

?- L = "a"||"b"||"c".
   L = [a,b,c].

?- L = ""||K.
   L = K.

?- L = "a"|| % multi-line
       "b"||
       "c".
   L = [a,b,c].

?- L = K||[].
   error(syntax_error(incomplete_reduction),...).

?- L = ("a")||[].
   error(syntax_error(incomplete_reduction),...).

Testing

All Prolog tests pass:

./target/debug/scryer-prolog -f --no-add-history src/tests/double_bar.pl \
  -f -g "use_module(library(double_bar_tests)), double_bar_tests:main_quiet(double_bar_tests)"

Output: All tests passed

let is_valid = if let Some(last_term) = self.terms.last() {
    match last_term {
        Term::CompleteString(_, _) => true,
        Term::Cons(_, _, _) => true,
@triska

The Cons case looks suspicious. What about [_]||Rs?

@hurufu

hurufu commented Oct 24, 2025

IMO this can be done using 3 operators: prefix |, suffix | and infix || during term expansion. Similarly to what I did here.

@UWN

UWN commented Oct 24, 2025

IMO this can be done using 3 operators: prefix |, suffix | and infix || during term expansion. Similarly to what I did here.

1mo, there is no term-expansion during read/1.

2do, it must be done with priority lower than 1. Otherwise user defined operators would interfere.

3tio, [1,2]||L is invalid. Only double quoted lists meaning chars or codes may be used.

4to, functional notation is not an alternative.

5to, '|' must not be a prefix operator. And an operator cannot suffix, just because it is often defined as an infix.

@bakaq

bakaq commented Oct 24, 2025

Note that the proposed syntax for this is much more lax than this, and arbitrary layout chars (whitespace and comments) should be allowed before, between and after each of the bars. The added tests currently only check layout chars after the second bar.

@jjtolton

Fixed the issue raised by @triska regarding [_]||Rs being incorrectly accepted.

Changes

The parser now correctly rejects list literals before ||:

?- [1,2,3]||K.
   error(syntax_error(incomplete_reduction),...)

?- [_]||Rs.
   error(syntax_error(incomplete_reduction),...)

Implementation

Removed the Term::Cons case from validation - it was accepting ANY list construct. Now only CompleteString and PartialString terms are allowed before ||, ensuring only double-quoted string literals work:

  • "abc"||K - valid string literal
  • [1,2,3]||K - list literal (now rejected)
  • [_]||Rs - list with variable (now rejected)

All existing valid cases still work, and all Prolog tests pass.

@bakaq

bakaq commented Oct 24, 2025

I'm pretty sure that [a,b,c]||S still gets unexpectedly accepted like this, because the [a,b,c] is currently being parsed as a string.

@jjtolton

jjtolton commented Oct 24, 2025

I'm pretty sure that [a,b,c]||S still gets unexpectedly accepted like this, because the [a,b,c] is currently being parsed as a string.

So it's syntactically invalid even if it's semantically equivalent to a string? (this indeed is a valid parse under af38843)

@bakaq

bakaq commented Oct 24, 2025

Yep, it does accept it:

?- A = [a,b,c]||X.
   A = [a,b,c|X].

So it's syntactically invalid even if it's semantically equivalent to a string?

Yes, the double bar proposal is just syntax, and it should work only with double-quoted syntax and not in any other case, as specified in the syntax description: term = double quoted list, bar, bar, term ;

"asdfasd" | | /* a */ A is valid, [a,b,c]||X or even ("asdfa")||X is not.

@jjtolton

ah kkk

@jjtolton

Addressed the above in 9b04bcc.

@jjtolton

Fixed #3160: Empty list [] was incorrectly accepted before ||.

Commit: 16ea9d3

—J.J.'s robot.

jjtolton and others added 7 commits November 29, 2025 16:08
Adds support for the double bar operator as specified in:
https://www.complang.tuwien.ac.at/ulrich/iso-prolog/double_bar

Changes:
- Lexer: Added DoubleBar token, detects || vs single |
- Parser: Handles "string"||Tail syntax with priority 1
  - Validates that || only appears after string literals
  - Rejects || after variables: K||[] => syntax_error
  - Rejects || after parenthesized expressions: ("a")||[] => syntax_error
  - Creates PartialString for non-empty strings
  - Replaces list tail for code lists
  - Empty strings correctly collapse (""||K unifies with K)
- Tests: Comprehensive Prolog integration tests covering:
  - All spec examples including multi-line with comments
  - Edge cases (empty strings, chaining)
  - Syntax validation for all invalid cases

All Prolog tests pass. Examples:
- "abc"||K => [a,b,c|K]
- "a"||"b"||"c" => [a,b,c]
- ""||K => K
- "a"|| % comment
  "b"||"c" => [a,b,c]
- K||[] => syntax_error (as required)
- ("a")||[] => syntax_error (as required)

Addresses feedback from @triska about [_]||Rs being incorrectly accepted.

The double bar operator should only work with double-quoted string literals,
not arbitrary list constructs. This commit:

- Removes Term::Cons validation case (was accepting any list)
- Only allows Term::CompleteString and Term::PartialString
- Simplifies push_binary_op to handle only string terms
- Removes unused replace_list_tail function
- Documents invalid cases in test file

Invalid cases now correctly rejected:
- [1,2,3]||K => syntax_error
- [_]||Rs => syntax_error
- K||[] => syntax_error (already worked)
- ("a")||[] => syntax_error (already worked)

All valid string cases still work:
- "abc"||K => [a,b,c|K]
- "a"||"b"||"c" => [a,b,c]
- ""||K => K

Addresses feedback from @bakaq about [a,b,c]||S being incorrectly accepted
and support for spaced "| |" syntax per spec.

Issue 1: List syntax like [a,b,c] was incorrectly accepted
---------------------------------------------------------
The issue was that in chars mode, reduce_list() converts lists like [a,b,c]
to CompleteString terms, making them indistinguishable from actual string
literals "abc" by the time the || validation runs.

Solution: Introduce LIST_TERM spec constant to mark terms originating from
list syntax ([...]), distinct from string literals ("..."). The || operator
now correctly rejects list syntax while accepting only double-quoted strings.
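A minimal sketch of the provenance idea, under the assumption that a tag is recorded when the term is read: two terms with identical contents stay distinguishable by how they were written. The `Origin` enum and the helper functions are hypothetical; the PR itself uses a LIST_TERM spec constant rather than an enum.

```rust
/// Hypothetical provenance tag. The PR stores this information as a
/// `LIST_TERM` spec constant; an enum is used here only for clarity.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Origin {
    DoubleQuoted,
    ListSyntax,
}

/// A parsed character sequence together with the syntax it came from.
#[derive(Debug, Clone, PartialEq)]
struct Parsed {
    chars: Vec<char>,
    origin: Origin,
}

/// Model of reading a literal such as "abc".
fn read_double_quoted(s: &str) -> Parsed {
    Parsed { chars: s.chars().collect(), origin: Origin::DoubleQuoted }
}

/// Model of reading list syntax such as [a,b,c] in chars mode.
fn read_char_list(items: &[char]) -> Parsed {
    Parsed { chars: items.to_vec(), origin: Origin::ListSyntax }
}

/// Only terms written with double quotes may appear before `||`.
fn may_precede_double_bar(term: &Parsed) -> bool {
    term.origin == Origin::DoubleQuoted
}
```

Here `read_double_quoted("abc")` and `read_char_list(&['a', 'b', 'c'])` carry the same characters, yet only the first passes `may_precede_double_bar`.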

Issue 2: Spaced "| |" syntax not supported
-------------------------------------------
The spec explicitly shows "a"| |"b"| |"c" as valid syntax with spaces
between the bars. Modified HeadTailSeparator handling to peek ahead and
detect two consecutive | tokens, treating them as DoubleBar.

Comments are supported in all positions per spec:
- Before bars: "a" /* comment */ || "b"
- After bars: "a" || /* comment */ "b"
- Between bars: "a" | /* comment */ | "b"
- Multiple positions: "a" /* c1 */ | /* c2 */ | /* c3 */ "b"
- Line comments: "a" | % comment
                     | "b"
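The layout rule can be sketched at the character level as below. This is an illustration only: the real parser works on tokens and peeks ahead for a second `|` token, and the function names `skip_layout` and `strip_double_bar` are hypothetical.

```rust
/// Skip whitespace, `% ...` line comments and `/* ... */` block comments
/// at the start of `src`, returning the remaining text.
fn skip_layout(src: &str) -> &str {
    let mut rest = src;
    loop {
        let trimmed = rest.trim_start();
        if let Some(after) = trimmed.strip_prefix('%') {
            // Line comment: drop everything up to and including the newline.
            rest = after.find('\n').map_or("", |i| &after[i + 1..]);
        } else if let Some(after) = trimmed.strip_prefix("/*") {
            // Block comment: drop up to the closing `*/`.
            // An unterminated comment is left in place for the caller.
            match after.find("*/") {
                Some(i) => rest = &after[i + 2..],
                None => return trimmed,
            }
        } else {
            return trimmed;
        }
    }
}

/// If `src` starts with layout, `|`, layout, `|`, return the text after
/// the second bar; otherwise return `None`.
fn strip_double_bar(src: &str) -> Option<&str> {
    let after_first = skip_layout(src).strip_prefix('|')?;
    skip_layout(after_first).strip_prefix('|')
}
```

Applied to the text that follows a string literal, `strip_double_bar` accepts both the compact `||` and the spaced or commented forms listed above, and returns `None` when only a single bar is present.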

Changes:
- src/parser/ast.rs: Add LIST_TERM = 0x5000 constant
- src/parser/parser.rs:
  * Set LIST_TERM spec in reduce_list()
  * Check LIST_TERM in || validation to reject list syntax
  * Detect | | token pair and handle as DoubleBar
- src/tests/double_bar.pl:
  * Document [a,b,c]||S as invalid case
  * Add tests for spaced "| |" syntax
  * Add comprehensive tests for comments in all positions

Tested:
✅ [a,b,c]||S correctly rejected
✅ [1,2,3]||K correctly rejected
✅ [_]||Rs correctly rejected
✅ "abc"||K works correctly
✅ "abc" | | K works correctly (spaced syntax)
✅ "a" | | "b" | | "c" works correctly
✅ Comments before, after, and between bars work correctly
✅ All 21 integration tests pass
✅ All cargo tests pass (no regressions)

Implements support for the double bar (||) operator when double_quotes
flag is set to codes, fixing issue mthom#3142. The operator now correctly
handles string literals in all three modes (chars, codes, atom).

Changes:
- Modified parser to accept Term::Cons and empty list literals for
  codes-mode strings before the || operator
- Added replace_cons_tail helper to properly replace the tail of
  codes-mode lists
- Extended push_binary_op to handle codes-mode string concatenation
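To illustrate the codes-mode handling, here is a minimal sketch of replacing the terminating `[]` of a code list with a new tail. The helper name `replace_cons_tail` is taken from this commit message, but the body and the `Term` enum below are hypothetical and much simpler than the parser's real types.

```rust
/// Hypothetical, simplified term model (not Scryer's actual `Term` type).
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Var(String),
    Int(u32),
    Nil,
    Cons(Box<Term>, Box<Term>),
}

/// Codes-mode reading of a double-quoted literal: "ab" becomes [97,98].
fn codes_list(s: &str) -> Term {
    s.chars().rev().fold(Term::Nil, |acc, c| {
        Term::Cons(Box::new(Term::Int(c as u32)), Box::new(acc))
    })
}

/// Replace the terminating `[]` of a proper list with `new_tail`.
/// A term that does not end in `[]` is returned unchanged.
fn replace_cons_tail(list: Term, new_tail: Term) -> Term {
    match list {
        Term::Cons(head, rest) => {
            Term::Cons(head, Box::new(replace_cons_tail(*rest, new_tail)))
        }
        Term::Nil => new_tail,
        other => other,
    }
}
```

In this model `replace_cons_tail(codes_list("ab"), K)` gives `[97,98|K]`, and an empty literal collapses to `K` itself.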

Tests:
- Added 35 comprehensive tests (8 chars mode + 27 codes mode)
- Full parity in comment handling, spacing, unicode, and edge cases
- All 56 tests passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

The spaced bar syntax `| |` failed with syntax_error(incomplete_reduction)
when double_quotes flag was set to codes. The compact syntax `||` worked
fine in both modes.

Root cause: The spaced bar validation only accepted CompleteString and
PartialString terms, but in codes mode "abc" becomes Term::Cons([97,98,99]).

Fix: Add Term::Cons and empty list handling to spaced bar validation,
matching the compact || validation logic.

Also:
- Add discontiguous(test/2) directive to double_bar.pl (needed because
  set_prolog_flag directives appear between test clauses)
- Add double_bar_tests.stdout for proper CLI test registration

- double_bar.pl: 29 chars mode tests using test_framework.pl
- double_bar_codes.pl: 27 codes mode tests standalone (no test_framework.pl)
  - Defines format helpers BEFORE set_prolog_flag(double_quotes, codes)
  - Uses copy_term/2 to avoid variable sharing between tests
  - Run via CLI with -g main

This avoids the issue where test_framework.pl's format("~s",...) doesn't
handle character codes (only atoms), making main() fail for codes mode tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Document the abstract syntax from the spec:
  term = double quoted list, bar, bar, term ;

This confirms that the RIGHT side (tail) can be any term at priority 0,
including atoms and numbers, not just variables. Reference WG17 2025-06-02
decision accepting option 1 (only after double quotes).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

jjtolton and others added 2 commits November 29, 2025 16:50
- Empty list [] now uses LIST_TERM spec (was TERM), consistent with non-empty lists
- Added DoubleBar check in compute_arity_in_list to reject [a,b]||X patterns
- Added syntax error tests for invalid || usage:
  - []||X (empty list)
  - [a,b]||X (non-empty list)
  - X||Y (variable)
  - foo||X (atom)
  - 123||X (number)

Per WG17 2025 spec: || only valid after double-quoted strings, not list notation.
Reference: https://www.complang.tuwien.ac.at/ulrich/iso-prolog/double_bar

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Invalid syntax cases now tested via double_bar_syntax_errors.md trycmd tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>