Commits · v0.62.0 · Hemant / semgrep

17 Aug, 2021 4 commits

Release 0.62.0 · 5460dcfe
brendon authored 3 years ago

5460dcfe
Handle empty pattern error (#3723) · 3c009382
Brendon Go authored 3 years ago
```
* Handle empty pattern error

* fixup! Handle empty pattern error
```
3c009382

core: Fix fatal error "an equal is already in progress" (#3718) · f16a879f

Iago Abal authored 3 years ago

Also added an optimized `Pattern_match.uniq` function to replace
`Common.uniq_by`.

Note that Semgrep returns the same matches as before, but sometimes in
a different order.
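
To illustrate why a dedicated deduplication function can be much cheaper than a generic pairwise one, here is a minimal sketch. It is written in Python rather than OCaml, and `uniq_by` and `uniq_hashed` are illustrative stand-ins, not the actual `Common.uniq_by` or `Pattern_match.uniq` code. The match tuples are made up for the example.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def uniq_by(eq: Callable[[T, T], bool], items: Iterable[T]) -> List[T]:
    """Pairwise deduplication: quadratic number of equality tests in the worst case."""
    kept: List[T] = []
    for item in items:
        if not any(eq(item, other) for other in kept):
            kept.append(item)
    return kept


def uniq_hashed(key: Callable[[T], Hashable], items: Iterable[T]) -> List[T]:
    """Hash-based deduplication: one key computation and one set lookup per item."""
    seen = set()
    kept: List[T] = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            kept.append(item)
    return kept


# Hypothetical matches, identified by (file, start offset, end offset).
matches = [("a.js", 0, 5), ("a.js", 0, 5), ("b.js", 3, 9)]
assert uniq_by(lambda x, y: x == y, matches) == uniq_hashed(lambda m: m, matches)
```

This sketch happens to keep first-seen order. The commit only promises the same set of matches, possibly in a different order.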

This issue was reported by a customer weeks ago but we could not figure
it out without an example. I just happened to hit the same bug when
benchmarking some taint rules.

test plan:

    % make test

    % semgrep-core -timeout 5 -j 1 -profile -lang js \
         -config semgrep-rules/typescript/react/security/audit/react-props-injection.yaml \
         semgrep/parsing-stats/lang/javascript/tmp/mui-org-material-ui/test/bundling/fixtures/next-webpack5

      #^ Try it several times, it no longer triggers a fatal error

Rule react-props-injection contains the following pattern:

    import $PROPS from "...";
    ...

This pattern leads to more than 90k matches in
lang/javascript/tmp/mui-org-material-ui/test/bundling/fixtures/next-webpack5/pages/next-webpack.fixture.js
which causes Common.uniq_by to "explode". If we are unlucky and the
analysis times out while we are testing the equality of two ASTs, then
`AST_utils.busy_with_equal` is left in the wrong state and all the
subsequent equality tests will lead to a fatal error.
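
The failure mode described above (a guard flag left set when an exception interrupts the guarded code) can be reproduced in a few lines. This is a Python sketch under stated assumptions: `EqualState`, `Timeout`, `equal_unsafe` and `equal_safe` are hypothetical names, and the `try`/`finally` variant shows one common way to make such a guard robust, not necessarily what this commit does.

```python
class Timeout(Exception):
    """Stands in for the exception raised when the analysis times out."""


class EqualState:
    """Holds the re-entrancy guard (plays the role of `busy_with_equal`)."""
    busy = False


def equal_unsafe(a, b, interrupt=False):
    """Sets the guard but never resets it if an exception escapes."""
    if EqualState.busy:
        raise RuntimeError("an equal is already in progress")
    EqualState.busy = True
    if interrupt:
        raise Timeout()  # simulated timeout in the middle of the comparison
    result = a == b
    EqualState.busy = False
    return result


def equal_safe(a, b, interrupt=False):
    """Same check, but the guard is always reset through try/finally."""
    if EqualState.busy:
        raise RuntimeError("an equal is already in progress")
    EqualState.busy = True
    try:
        if interrupt:
            raise Timeout()
        return a == b
    finally:
        EqualState.busy = False
```

After one interrupted call to `equal_unsafe`, every later call raises the "already in progress" error, while `equal_safe` keeps working after an interruption.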

f16a879f

[C++] more refactoring in ast_cpp.ml (#3722) · 70ebf7e0
Yoann Padioleau authored 3 years ago
```
test plan:
make test
```
70ebf7e0

14 Aug, 2021 1 commit
- [C++] more (#3719) · 885363cf
  Yoann Padioleau authored 3 years ago
```
test plan:
make test
```
  885363cf
13 Aug, 2021 2 commits
- [C++] Progress (#3717) · 2f88fa0e
  Yoann Padioleau authored 3 years ago
```
* [C++] more Parse_cpp_tree_sitter.ml todos

This will help PA-14

test plan:
make test

* progress

* fix CI
```
  2f88fa0e
- Avoid exit failure with SARIF output and suppressed findings (#3715) · 33bfa7a0
  mschwager authored 3 years ago
```
Fixes #3680.
```
  33bfa7a0
12 Aug, 2021 1 commit
- Fix 'pattern-regex' with completely empty file (#3707) · b01e66e5
  mschwager authored 3 years ago
```
Fixes #3705.
```
  b01e66e5
11 Aug, 2021 4 commits

[Hack] Add Semgrep extension tests (#3704) · bd313dd7

David Frankel authored 3 years ago

* Combine else-if tokens

* Add type param support

* [Hack] Add basic testing

* Finish alpha features

* Add more tests (still need to update semgrep-hack ref)

* Improve dots_params

* [Hack] Extend metavariables

* Update semgrep-hack version

* Add hacklang to semgrep targets

* Update error messages

bd313dd7

adding more options to semgrep sarif formatter (#3697) · 28c26491
kingbbello authored 3 years ago
```
* adding more options to semgrep sarif formatter

* add changelog

* handle empty tag case
```
28c26491

Optimize pattern $X (#3703) · 24e8cabf

Iago Abal authored 3 years ago

* Optimize pattern $X

Fixes: bfc4f3da ("Revert `pattern: $X` optimization (#3478)")
Fixes: 32e88975 ("Separate pattern: $X in anded patterns to a special field (#3435)")

test plan:

    % make test # tests included

    % semgrep-core -lang py -config \
          ~/semgrep/semgrep-core/tests/OTHER/rules/pattern-x-1.yaml \
          bench/django/input/django
      #^ now takes ~6 seconds (i.e., almost 10x faster)

* engine: Clean up Specialize_formula

test plan:
make test

* Emma's comments

24e8cabf

Ensure we don't accidentally return folders as targets (#3701) · 36f9ac2e

Martin Jambon authored 3 years ago

* Ensure that paths aren't directories

Workaround for issue with git submodules and git ls-files, see #3660 for details.

* Add submodule fix to changelog

* Run pre-commit hooks

* Ensure we don't end up with folders in a list of targets

* Update CHANGELOG.md
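
A minimal sketch of the kind of filtering described above, assuming the file listing may contain a directory entry such as a submodule path. `keep_regular_files` is a hypothetical helper, not the function used in Semgrep.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def keep_regular_files(paths: Iterable[Path]) -> List[Path]:
    """Drop anything that is not a regular file (directories, missing paths)."""
    return [p for p in paths if p.is_file()]


with tempfile.TemporaryDirectory() as root:
    submodule_dir = Path(root) / "vendor"  # hypothetical submodule checkout
    submodule_dir.mkdir()
    source_file = Path(root) / "main.py"
    source_file.write_text("x = 1\n")
    targets = keep_regular_files([submodule_dir, source_file])
    assert targets == [source_file]
```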

36f9ac2e

10 Aug, 2021 4 commits
- Update README.md (#3709) · e9d028f4
  raghavjain3 authored 3 years ago
  
  e9d028f4
- Improve Parse_cpp_tree_sitter.ml (#3706) · 011b535a
  Yoann Padioleau authored 3 years ago
  
  011b535a
- C++ (#3696) · b70b62b1
  Yoann Padioleau authored 3 years ago
```
progress

test plan:
make test
```
  b70b62b1
- Remove documentation from semgrep/doc (#3700) · 214ecb32
  Emma Jin authored 3 years ago
  
  214ecb32
09 Aug, 2021 7 commits

Fixups and test for running perf benchmarks via docker (#3301) · a64a0129

Brian Kroth authored 3 years ago

* fixups and tests for running benchmarks in docker

* need to pass --quiet to avoid logging info from failing to json parse

* Update perf/run-benchmarks

a64a0129

core: Better communicate errors to the CLI (#3686) · 7e4bf4ae

Iago Abal authored 3 years ago

- Add backtrace to fatal errors to help diagnose bug reports (Pfff).
- Report rule evaluation errors instead of "silently" logging them.

Helps #3547

test plan:

    % cat fatal.py
    def foo():
    return 1
    }
    % semgrep -l py -e 'x' fatal.py
    running 1 rules...
    semgrep-core reported a fatal error:
    -----
    Fatal Error: (Failure "Lexer_python.top_mode: empty stack")
    Raised at Parse_target.run in file "src/parsing/Parse_target.ml", line 176, characters 17-26
    ...
    -----
    Please file a bug report at https://github.com/returntocorp/semgrep/issues/new/choose
    ...

    % cat warn.yaml
    rules:
    - id: test
      languages: [python]
      patterns:
      - pattern: $X
      - metavariable-pattern:
          metavariable: $Y
          pattern: x
      message: Test
      severity: ERROR
    % cat warn.py
    x
    % semgrep -c warn.yaml -v warn.py
    ...
    semgrep-core reported a matching error
    --> matching internal error: rule test: metavariable-pattern failed because $Y it not in scope, please check your rule
    ...

7e4bf4ae

use latest pfff (#3695) · 6bb5455d
Yoann Padioleau authored 3 years ago

6bb5455d
Parse_php_tree_sitter.ml boilerplate (#3694) · 860d67bc
Yoann Padioleau authored 3 years ago
```
This will help https://github.com/returntocorp/semgrep/issues/3576

test plan:
make test
```
860d67bc
add semgrep-php (#3693) · a925a9ba
Yoann Padioleau authored 3 years ago

a925a9ba

[Java] fix the range and autofix of Cast expression (#3692) · bc0051dc

Yoann Padioleau authored 3 years ago

This closes https://github.com/returntocorp/semgrep/issues/3669
This is a bit ugly. The right fix would be to extend
the AST_generic.expr record to store the leftmost and rightmost
token in the expr, so we would not need to abuse tok in the
expr_kind itself.

test plan:
test file included
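
The "right fix" mentioned above can be pictured as follows. This is only a Python sketch of the idea, with hypothetical `Tok` and `Expr` types. It is not the `AST_generic` definition, and the offsets are made up for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Tok:
    text: str
    start: int  # offset of the first character
    end: int    # offset just past the last character


@dataclass(frozen=True)
class Expr:
    kind: str
    leftmost: Tok   # first token belonging to the expression
    rightmost: Tok  # last token belonging to the expression

    def span(self) -> Tuple[int, int]:
        """Source range covered by the whole expression."""
        return (self.leftmost.start, self.rightmost.end)


source = "(int) x"
open_paren = Tok("(", 0, 1)
operand = Tok("x", 6, 7)
cast = Expr("Cast", leftmost=open_paren, rightmost=operand)
start, end = cast.span()
assert source[start:end] == "(int) x"
```

With both boundary tokens stored on the expression, the range of a cast covers the opening parenthesis as well as the operand, which is what an autofix needs in order to replace the whole expression.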

bc0051dc

Don't consider test files in language stats for the repository on Github (#3688) · 9db4d5a9
Isaac Evans authored 3 years ago

9db4d5a9

08 Aug, 2021 1 commit

[OCaml] support open XXx entity aliasing by using LSP (#3687) · 1b2e39df

Yoann Padioleau authored 3 years ago

* [OCaml] support open XXx entity aliasing by using LSP

test plan:
```
yy -no_bloom_filter -lsp -lang ocaml -e 'AST_generic.fake_bracket $X' tests/test_lsp.ml -log_config_file /tmp/xxx
+ /home/pad/yy/_build/default/src/cli/Main.exe -no_bloom_filter -lsp -lang ocaml -e 'AST_generic.fake_bracket $X' tests/test_lsp.ml -log_config_file /tmp/xxx
START
tests/test_lsp.ml:17
   AST_generic.fake_bracket [] |> ignore;
tests/test_lsp.ml:18
   G.fake_bracket [] |> ignore;
"AST_generic"
tests/test_lsp.ml:19
   fake_bracket [] |> ignore;

yy -no_bloom_filter -lsp -lang ocaml -e 'AST_generic.Call ($X, $Y)' tests/test_lsp.ml -log_config_file /tmp/xxx
+ /home/pad/yy/_build/default/src/cli/Main.exe -no_bloom_filter -lsp -lang ocaml -e 'AST_generic.Call ($X, $Y)' tests/test_lsp.ml -log_config_file /tmp/xxx
START
tests/test_lsp.ml:6
   let res0 = AST_generic.Call (Int (None, fake ""), fb []) in
tests/test_lsp.ml:6
   let res0 = AST_generic.Call (Int (None, fake ""), fb []) in
tests/test_lsp.ml:7
   let res1 = G.Call (Int (None, fake ""), fb []) in
tests/test_lsp.ml:7
   let res1 = G.Call (Int (None, fake ""), fb []) in
"AST_generic"
tests/test_lsp.ml:8
   let res2 = Call (Int (None, fake ""), fb []) in
"AST_generic"
tests/test_lsp.ml:8
   let res2 = Call (Int (None, fake ""), fb []) in

yy -no_bloom_filter -lsp -lang ocaml -e '| AST_generic.Call ($X, $Y)' tests/test_lsp.ml -log_config_file /tmp/xxx
+ /home/pad/yy/_build/default/src/cli/Main.exe -no_bloom_filter -lsp -lang ocaml -e '| AST_generic.Call ($X, $Y)' tests/test_lsp.ml -log_config_file /tmp/xxx
START
tests/test_lsp.ml:10
   | AST_generic.Call (x, (_, [], _)) -> 1
tests/test_lsp.ml:11
   | G.Call (x, (_, [_], _)) -> 1
"AST_generic"
tests/test_lsp.ml:12
   | Call (x, y) -> 1

```

* misc

1b2e39df

06 Aug, 2021 5 commits

Refactor m_name, factorize more code (#3684) · 1fd2534d

Yoann Padioleau authored 3 years ago

* Refactor m_name, factorize more code

test plan:
make test

* more

* more

* factorize m_type naming part in m_name

* factorize m_attr

1fd2534d

[OCaml] module aliasing for Constructor/PatConstructor via m_name (#3682) · 9db5f07a

Yoann Padioleau authored 3 years ago

We should be able to factorize more code and move more
aliasing logic into m_name, for TyN and NamedAttr too
(and maybe m_expr).

test plan:
test file included

9db5f07a

core: Fix parsing of numeric literals in rule files (#3675) · 203997c4
Iago Abal authored 3 years ago
```
test plan:
1. make test # tests included
2. semgrep -l py -e '42' any_file.py
```
203997c4

Naming: resolve names for Constructor and PatConstructor (#3681) · 8606cc70

Yoann Padioleau authored 3 years ago

This also factorizes name resolution a bit for all 'name' types.

test plan:
```
pad@yrax yy (naming_ctor)]$ yy -lang ocaml -dump_named_ast tests/ocaml/aliasing_qualified_contructor.ml
+ /home/pad/yy/_build/default/src/cli/Main.exe -lang ocaml -dump_named_ast tests/ocaml/aliasing_qualified_contructor.ml
[0.038  Info       Main.Dune__exe__Main ] loaded log_config.json
[0.038  Info       Main.Dune__exe__Main ] Executed as: /home/pad/yy/_build/default/src/cli/Main.exe -lang ocaml -dump_named_ast tests/ocaml/aliasing_qualified_contructor.ml
[0.038  Info       Main.Dune__exe__Main ] Version: semgrep-core version: v0.61.0-9-g14616953-dirty, pfff: 0.42
[0.038  Info       Main.Parse_target    ] trying to parse with Pfff parser tests/ocaml/aliasing_qualified_contructor.ml
[0.038  Info       Main.Parse_target    ] Parse_target.parse_and_resolve_name_use_pfff_or_treesitter done
Pr(
  [DefStmt(
     ({
       name=EN(
              Id(("G", ()),
                {
                 id_resolved=Ref(Some((ImportedModule(
                                         DottedName([("AST_generic", ())])),
                                       1)));
                 id_type=Ref(None); id_constness=Ref(None); }));
       attrs=[]; tparams=[]; },
      ModuleDef({mbody=ModuleAlias([("AST_generic", ())]); })));
   DefStmt(
     ({
       name=EN(
              Id(("foo", ()),
                {id_resolved=Ref(None); id_type=Ref(None);
                 id_constness=Ref(None); }));
       attrs=[]; tparams=[]; },
      FuncDef(
        {fkind=(Function, ()); fparams=[ParamPattern(PatLiteral(Unit(())))];
         frettype=None;
         fbody=OtherStmt(OS_ExprStmt2,
                 [E(
                    Constructor(
                      IdQualified(
                        (("Call", ()),
                         {name_qualifier=Some(QDots([("G", ())]));
                          name_typeargs=None; }),
                        {
                         id_resolved=Ref(Some((ImportedEntity(
                                                 [("AST_generic", ());
                                                  ("Call", ())]),
                                               0)));
                         id_type=Ref(None); id_constness=Ref(None); }),
                      [L(Int((Some(1), ()))); L(Int((Some(2), ())))]))]);
         })))])
```

8606cc70

Open files in binary mode so as to bypass CRLF translation on Windows. (#3663) · 14616953
Martin Jambon authored 3 years ago
```
* Open files in binary mode so as to bypass CRLF translation on Windows.

* Update pfff

* Update pfff

* Update pfff
```
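
As an analogy only: Python's text mode performs a similar newline translation on read (on every platform, not just Windows), so the difference between the two modes is easy to observe. `read_binary_and_text` is a hypothetical helper written for this illustration and says nothing about how pfff or semgrep-core open files.

```python
import os
import tempfile
from typing import Tuple


def read_binary_and_text(data: bytes) -> Tuple[bytes, str]:
    """Write `data` to a temp file, then read it back in binary and in text mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:  # binary mode: bytes come back untouched
            raw = f.read()
        with open(path, "r", encoding="ascii") as f:  # text mode: newlines translated
            text = f.read()
    finally:
        os.remove(path)
    return raw, text


raw, text = read_binary_and_text(b"a\r\nb\n")
assert raw == b"a\r\nb\n"  # CRLF preserved
assert text == "a\nb\n"    # CRLF collapsed to LF by text mode
```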
14616953

05 Aug, 2021 6 commits

core: Make sure memory-limit test works (#3676) · 39644184
Iago Abal authored 3 years ago
```
The GC alarm doesn't trigger so reliably it seems, the test was
failing on my laptop.
```
39644184
Fix version deduplication bug was fixed in changelog (#3679) · 3e622f2f
Brendon Go authored 3 years ago

3e622f2f
Add deprecation of experimental features to changelog 0.61.0 (#3678) · 8e360095
Brendon Go authored 3 years ago

8e360095

Refactor Constructor to take a name, not a dotted_ident (#3674) · 812908e2

Yoann Padioleau authored 3 years ago

This will later allow aliasing on constructor calls as well,
e.g. finding 'G.Call (foo, bar)' when looking for AST_generic.Call($X,
$Y).
This PR just does the Constructor refactoring.

test plan:
make

812908e2

[OCaml] basic support for module aliasing (#3673) · a83c2c8f

Yoann Padioleau authored 3 years ago

It is quite common in OCaml (at least in my OCaml codebases)
to use module aliasing such as 'module G = AST_generic'.
We want a pattern like 'AST_generic.foo' to also match
code where the AST_generic module has been aliased, just
like we do in Python or Go where we handle import aliasing.

test plan:
test file included
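
The matching behavior described above can be sketched as a lookup in an alias table before names are compared. This is a simplified Python illustration. `resolve_alias` and `matches` are hypothetical helpers, and the real matcher works on resolved AST names rather than strings.

```python
from typing import Dict


def resolve_alias(name: str, aliases: Dict[str, str]) -> str:
    """Rewrite the leading module of a dotted name if it is a known alias."""
    head, sep, rest = name.partition(".")
    return aliases.get(head, head) + sep + rest


def matches(pattern: str, code_name: str, aliases: Dict[str, str]) -> bool:
    """A pattern name matches if the code name resolves to it."""
    return resolve_alias(code_name, aliases) == pattern


aliases = {"G": "AST_generic"}  # recorded from `module G = AST_generic`
assert matches("AST_generic.foo", "G.foo", aliases)
assert matches("AST_generic.foo", "AST_generic.foo", aliases)
assert not matches("AST_generic.foo", "H.foo", aliases)
```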

a83c2c8f

More fixes for parsing-stats CI job (#3672) · 10fb4fa2

Yoann Padioleau authored 3 years ago

The main change is that we just skip big files. They cause
too many problems and anyway those files are not real,
they are autogenerated, or test files for benchmarks, or just
huge data converted as code (e.g., unicode tables).

Do we do similar filtering in semgrep itself? I remember martin
at some point was looking at the size of files in spacegrep.

test plan:
make test
./run-lang c
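
A minimal sketch of size-based filtering, in Python. The cutoff `MAX_BYTES` and the helper `skip_big_files` are assumptions made for illustration. The commit message does not say which threshold the parsing-stats job uses.

```python
import os
import tempfile
from typing import Iterable, List

MAX_BYTES = 1_000_000  # hypothetical cutoff, not taken from the commit


def skip_big_files(paths: Iterable[str], max_bytes: int = MAX_BYTES) -> List[str]:
    """Keep only the files whose size on disk is at most `max_bytes`."""
    return [p for p in paths if os.path.getsize(p) <= max_bytes]


with tempfile.TemporaryDirectory() as root:
    small = os.path.join(root, "small.c")
    big = os.path.join(root, "unicode_table.c")
    with open(small, "wb") as f:
        f.write(b"int x;\n")
    with open(big, "wb") as f:
        f.write(b"0" * 2048)
    assert skip_big_files([small, big], max_bytes=1024) == [small]
```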

10fb4fa2

04 Aug, 2021 4 commits

Merge pull request #3668 from returntocorp/release-0.61.0 · d2654684
Brendon Go authored 3 years ago
```
Release 0.61.0
```
d2654684
Remove update-docker call (#3670) · 45d4387b
mschwager authored 3 years ago

45d4387b
Release 0.61.0 · 132bb4b1
Matt Schwager authored 3 years ago

132bb4b1

Many fixes to improve the parsing_stat CI job (#3664) · 3cf4f9e6

Yoann Padioleau authored 3 years ago

* Many fixes to improve the parsing_stat CI job

The Lua parsing stat job was failing because it was using
a non-github repo which triggered an RPC error at clone time.

The Kotlin parsing stat job was failing because it was using
CST.dump_tree, which outputs things on stdout, which then messes up
the JSON.

The C parsing stat job was failing because of some out-of-memory
fatal errors on huge files. I tried to use Memory_limit
but ran into many issues. It's hard to intercept those Out_of_memory
reliably.

test plan:
./run-lang --upload lua
./run-lang --upload kotlin
./run-lang --upload c

* iago's comment
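
The Kotlin failure above is an instance of a general rule: when stdout carries machine-readable output, diagnostics belong on stderr. A small Python sketch, with a hypothetical `emit_stats` function and made-up stats fields:

```python
import contextlib
import io
import json
import sys


def emit_stats(stats: dict, debug: str) -> None:
    """Send diagnostics to stderr so that stdout contains only JSON."""
    print(debug, file=sys.stderr)
    print(json.dumps(stats))


out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    emit_stats({"lang": "kotlin", "parsed": 10}, "debug: dumping tree")

# stdout is still valid JSON; the debug line went elsewhere.
assert json.loads(out.getvalue()) == {"lang": "kotlin", "parsed": 10}
assert "dumping tree" in err.getvalue()
```

Had the debug line been printed to stdout as well, `json.loads` on the captured output would have raised a `json.JSONDecodeError`.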

3cf4f9e6

03 Aug, 2021 1 commit

Remove location information in cli errors with temp files (#3651) · dc3c7331

Emma Jin authored 3 years ago

* Remove location information in cli errors with temp files

Since the temp file error message is not helpful, remove it.

Test plan:

Modify `metavar_pattern_lang.yaml` in tests/OTHER/rules to have the pattern `bad eval_C("$CODE")`. Then run

```
(semgrep) ➜  rules git:(develop) ✗ semgrep --config metavar_pattern_lang.yaml metavar_pattern_lang.py
running 1 rules...
semgrep error: invalid pattern

Pattern `bad eval_C("$CODE")` could not be parsed as a Python semgrep pattern
```

* Updated snapshots

* Changelog

dc3c7331
