feat: add get_abstract tool for token-efficient paper metadata retrieval #67
yangziyu-620 wants to merge 1 commit into blazickjp:main
Add a lightweight get_abstract tool that fetches a paper's title, authors, abstract, categories, and published date via the arXiv API without downloading the full PDF. This lets users assess a paper's relevance before committing to a full download and read, saving significant tokens.
Pull request overview
This PR adds a new get_abstract MCP tool that allows users to fetch paper metadata (title, authors, abstract, categories, published date, PDF URL) directly from the arXiv API without downloading the full PDF. This provides a lightweight way to assess paper relevance before committing to the heavier download_paper + read_paper workflow, saving significant tokens.
Changes:
- New `get_abstract` tool implementation in `tools/get_abstract.py` with a tool definition and async handler that queries the arXiv API by paper ID
- Integration of the new tool into the module exports (`__init__.py`) and server routing (`server.py`)
- Documentation update in `CLAUDE.md` to list the new tool
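
As a rough sketch of what the tool definition could look like, the snippet below describes `get_abstract` as a plain dictionary with a JSON Schema for its input. The PR's actual `abstract_tool` is not shown in this excerpt, so the description string, the schema layout, and the `missing_arguments` helper are assumptions for illustration. Only the `paper_id` argument name is taken from the handler.

```python
from typing import Any, Dict, List

# Hypothetical tool description. The real definition lives in
# tools/get_abstract.py and is not shown in this excerpt; only the
# "paper_id" argument name comes from the handler code.
ABSTRACT_TOOL_SKETCH: Dict[str, Any] = {
    "name": "get_abstract",
    "description": (
        "Fetch title, authors, abstract, categories, and published date "
        "for an arXiv paper without downloading the PDF."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "paper_id": {"type": "string", "description": "arXiv paper ID"},
        },
        "required": ["paper_id"],
    },
}


def missing_arguments(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return the required argument names that are absent from `arguments`."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]
```

Checking required arguments up front would let a handler report a missing `paper_id` with a specific message instead of relying on the generic exception branch.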
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/arxiv_mcp_server/tools/get_abstract.py` | New tool implementation: abstract_tool definition and handle_get_abstract async handler that fetches paper metadata via the arxiv library |
| `src/arxiv_mcp_server/tools/__init__.py` | Exports the new abstract_tool and handle_get_abstract |
| `src/arxiv_mcp_server/server.py` | Imports, lists, and routes the new tool in list_tools() and call_tool() |
| `CLAUDE.md` | Documents the new get_abstract.py tool in the architecture overview |
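
The routing step can be pictured as a name-to-handler lookup. The sketch below is not the code in `server.py`, which this excerpt does not show; the `HANDLERS` table, the `fake_get_abstract` stand-in, and the unknown-tool message are assumptions, and the handlers return plain JSON strings rather than MCP content objects to keep the example dependency-free.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[[Dict[str, Any]], Awaitable[str]]


async def fake_get_abstract(arguments: Dict[str, Any]) -> str:
    """Stand-in handler; the real one queries arXiv and returns TextContent."""
    return json.dumps({"status": "success", "paper_id": arguments["paper_id"]})


# Hypothetical routing table: tool name -> async handler.
HANDLERS: Dict[str, Handler] = {"get_abstract": fake_get_abstract}


async def call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Dispatch a tool call by name, reporting unknown tools as JSON errors."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"status": "error", "message": f"Unknown tool: {name}"})
    return await handler(arguments)


if __name__ == "__main__":
    print(asyncio.run(call_tool("get_abstract", {"paper_id": "0000.00000"})))
```

With a table like this, registering a new tool amounts to adding one entry, which matches the small footprint of the `server.py` change described above.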

    async def handle_get_abstract(arguments: Dict[str, Any]) -> List[types.TextContent]:
        """Handle requests to get a paper's abstract without downloading."""
        try:
            paper_id = arguments["paper_id"]
            client = arxiv.Client()
            search = arxiv.Search(id_list=[paper_id])
            results = list(client.results(search))

            if not results:
                return [
                    types.TextContent(
                        type="text",
                        text=json.dumps(
                            {
                                "status": "error",
                                "message": f"Paper {paper_id} not found on arXiv",
                            }
                        ),
                    )
                ]

            paper = results[0]
            return [
                types.TextContent(
                    type="text",
                    text=json.dumps(
                        {
                            "status": "success",
                            "paper_id": paper_id,
                            "title": paper.title,
                            "authors": [a.name for a in paper.authors],
                            "abstract": paper.summary,
                            "categories": paper.categories,
                            "published": paper.published.isoformat(),
                            "pdf_url": paper.pdf_url,
                        },
                        indent=2,
                    ),
                )
            ]

        except Exception as e:
            return [
                types.TextContent(
                    type="text",
                    text=json.dumps(
                        {"status": "error", "message": f"Error: {str(e)}"}
                    ),
                )
            ]

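
On the consuming side, the handler's reply is a JSON string with a `status` field that is either `"success"` or `"error"`. The following sketch shows one way a caller might interpret it; the `describe_abstract_response` helper and the sample payloads are made up for illustration and are not part of the PR or real arXiv data.

```python
import json
from typing import Any, Dict


def describe_abstract_response(text: str) -> str:
    """Turn the handler's JSON text into a one-line summary or an error note."""
    payload: Dict[str, Any] = json.loads(text)
    if payload.get("status") != "success":
        return f"lookup failed: {payload.get('message', 'unknown error')}"
    authors = ", ".join(payload["authors"])
    # "published" is an ISO 8601 timestamp; the first 10 characters are the date.
    return f"{payload['title']} ({authors}; {payload['published'][:10]})"


# Made-up payloads in the shape the handler emits.
ok = json.dumps({
    "status": "success",
    "paper_id": "0000.00000",
    "title": "An Example Paper",
    "authors": ["A. Author", "B. Author"],
    "abstract": "Placeholder abstract.",
    "categories": ["cs.AI"],
    "published": "2024-01-02T03:04:05+00:00",
    "pdf_url": "https://example.org/0000.00000.pdf",
})
err = json.dumps({"status": "error", "message": "Paper 0000.00000 not found on arXiv"})

print(describe_abstract_response(ok))
print(describe_abstract_response(err))
```

Because both the not-found case and the generic exception case use `"status": "error"`, a caller can only tell them apart by inspecting the message text.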
This new tool handler lacks test coverage. The codebase has tests for similar tool handlers (tests/tools/test_download.py, tests/tools/test_search.py) that mock the arxiv.Client and verify behavior for success, not-found, and error cases. Consider adding a tests/tools/test_get_abstract.py with at least:
- A test for a successful abstract retrieval (mocking `arxiv.Client.results` to return a mock paper)
- A test for a paper not found (mock returning empty results)
- A test for an API error (mock raising an exception)
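
The three suggested cases could be sketched roughly as follows. To stay self-contained, this example does not import the PR's handler or the `arxiv` package: `get_abstract_json` is a simplified stand-in that mirrors the handler's branches with an injected client, and `make_mock_paper` and `run` are hypothetical helpers. In the repository, the tests would more likely patch `arxiv.Client` as the existing test files are said to do.

```python
import asyncio
import json
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Any, Dict
from unittest.mock import MagicMock


async def get_abstract_json(arguments: Dict[str, Any], client: Any) -> str:
    """Simplified stand-in for handle_get_abstract with an injected client.

    Mirrors the handler's three branches but returns the JSON text directly;
    the real handler builds arxiv.Search itself and wraps text in TextContent.
    """
    try:
        paper_id = arguments["paper_id"]
        results = list(client.results(paper_id))
        if not results:
            return json.dumps(
                {"status": "error", "message": f"Paper {paper_id} not found on arXiv"}
            )
        paper = results[0]
        return json.dumps({
            "status": "success",
            "paper_id": paper_id,
            "title": paper.title,
            "authors": [a.name for a in paper.authors],
            "abstract": paper.summary,
            "categories": paper.categories,
            "published": paper.published.isoformat(),
            "pdf_url": paper.pdf_url,
        })
    except Exception as e:
        return json.dumps({"status": "error", "message": f"Error: {str(e)}"})


def make_mock_paper() -> SimpleNamespace:
    """Build a fake paper exposing only the attributes the handler reads."""
    return SimpleNamespace(
        title="An Example Paper",
        authors=[SimpleNamespace(name="A. Author")],
        summary="Placeholder abstract.",
        categories=["cs.AI"],
        published=datetime(2024, 1, 2, tzinfo=timezone.utc),
        pdf_url="https://example.org/0000.00000.pdf",
    )


def run(client: Any) -> Dict[str, Any]:
    """Call the stand-in handler and decode its JSON reply."""
    return json.loads(asyncio.run(get_abstract_json({"paper_id": "0000.00000"}, client)))


def test_success() -> None:
    client = MagicMock()
    client.results.return_value = iter([make_mock_paper()])
    reply = run(client)
    assert reply["status"] == "success"
    assert reply["authors"] == ["A. Author"]


def test_not_found() -> None:
    client = MagicMock()
    client.results.return_value = iter([])
    assert "not found" in run(client)["message"]


def test_api_error() -> None:
    client = MagicMock()
    client.results.side_effect = RuntimeError("boom")
    assert run(client)["message"] == "Error: boom"
```

Each test exercises exactly one branch of the handler, so a regression in the not-found or exception path would show up as a single failing test.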
Summary
- `get_abstract` tool that fetches paper metadata (title, authors, abstract, categories, published date) via the arXiv API without downloading the full PDF
- Lets users assess paper relevance before committing to the `download_paper` + `read_paper` workflow, saving significant tokens

Changes
- `tools/get_abstract.py`
- `tools/__init__.py`
- `server.py`
- `CLAUDE.md`

Test plan