Compare commits

289 Commits

Author SHA1 Message Date
Tanner ff39918591 Modify search to work with article contents 2026-06-13 11:54:52 -06:00
Tanner 11ed648791 Ignore data.ms.old/ 2026-06-13 11:54:35 -06:00
Tanner eb2d8be765 feat: Add "Search in article" filter checkbox to results page
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-16 17:06:53 -07:00
Tanner 795c29c07a fix: Extract prose from HTML text field for indexing
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-12 16:14:36 -07:00
Tanner 643b46d5c4 refactor: Adapt Meilisearch integration to v1.29.0 API
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-12 13:34:48 -07:00
Tanner 82cc12c70a feat: Add MeiliSearch API key authentication 2025-12-12 13:34:43 -07:00
Tanner 8036fc9c1b fix: Remove single dollar sign math rendering due to false positives
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-05 17:24:40 +00:00
Tanner bdf5c5b533 Fix dt dd tags margin 2025-12-05 00:59:02 +00:00
Tanner 735165b4e5 fix: Adjust comment metadata indentation in comments
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-05 00:49:13 +00:00
Tanner a405d2276f fix: Refactor comments with DL/DD for text browser compatibility
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-05 00:45:00 +00:00
Tanner b3fc59f60a chore: Remove unused nojs div 2025-12-05 00:44:58 +00:00
Tanner ac0436e997 Fix buttons in color themes 2025-12-05 00:35:06 +00:00
Tanner c9ebc60087 Clear stories first on checkbox change 2025-12-04 23:12:30 +00:00
Tanner 546285d7f3 Don't setStories when existing list is empty 2025-12-04 22:57:26 +00:00
Tanner 1b1b938ac7 Style checkbox 2025-12-04 22:55:23 +00:00
Tanner fa7354a376 fix: Implement custom transparent checkbox for dark mode visibility
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 22:31:11 +00:00
Tanner 01b0e87a79 style: Apply transparent background to checkboxes 2025-12-04 22:31:07 +00:00
Tanner 32e438d15a fix: Cancel pending story fetches on filter change to prevent UI jumps
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 22:24:28 +00:00
Tanner 32355ff408 feat: Fetch smallweb stories iteratively until limit met
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 22:18:22 +00:00
Tanner d2aab946dc feat: Add domain exclusion to smallweb list loading 2025-12-04 22:18:19 +00:00
Tanner cf092908eb feat: Add smallweb filter checkbox and server-side filtering
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 22:09:11 +00:00
Tanner ab48c8ab1e Put the loading status down below 2025-12-04 21:10:20 +00:00
Tanner 56f27efa68 fix: Detect and render inline math using single dollar delimiters
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 20:56:14 +00:00
Tanner b36b437c19 fix: Convert inline align environments to display math
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 20:50:48 +00:00
Tanner 86b9ab6479 chore: Adjust console.log placement in Article component 2025-12-04 20:45:21 +00:00
Tanner a6537e27d3 chore: Add debug log for math block detection 2025-12-04 20:42:55 +00:00
Tanner bbe02400e8 fix: Render LaTeX expressions that are entire element contents
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 20:35:51 +00:00
Tanner 283952a31b Add latex packages 2025-12-04 20:31:40 +00:00
Tanner 15314874b5 feat: Add LaTeX math rendering support
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 20:29:13 +00:00
Tanner 87f74e4422 fix: Extend direct HTML rendering to math elements
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 20:24:08 +00:00
Tanner 78f09ab937 fix: Prevent React warnings for SVG attributes
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 20:23:08 +00:00
Tanner 78734b1dc9 Move logos into public directory 2025-12-04 19:54:56 +00:00
Tanner dc5aca5999 Downgrade humanize 2025-12-04 19:53:13 +00:00
Tanner b58610cf19 Freeze requirements 2025-12-04 19:51:42 +00:00
Tanner b476741b96 Don't locate css file on server 2025-12-04 19:49:19 +00:00
Tanner 07b2a702f8 chore: Remove conditional CSS import and improve alt attributes 2025-12-04 19:29:04 +00:00
Tanner dae6f831c3 feat: Display relative time on non-JS article info line
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 19:11:27 +00:00
Tanner 021c6cab59 style: Remove zero-width spaces from story info 2025-12-04 19:11:24 +00:00
Tanner 30fde32b28 feat: Render homepage feed server-side
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:42:14 +00:00
Tanner 7839fced8d feat: Include QotNews header for non-JS users
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:38:19 +00:00
Tanner 099d777faa feat: Add relative timestamps and permalinks to comments
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:35:43 +00:00
Tanner 518c9fe765 fix: Widen comments container on story page
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:32:57 +00:00
Tanner cf0fb085b6 refactor: Align non-JS comments page structure and style
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:29:41 +00:00
Tanner d1ec2a1a62 style: Match non-JS article page styling and layout to JS version
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:26:06 +00:00
Tanner e463a6da53 feat: Link compiled CSS bundle for non-JS client
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:16:50 +00:00
Tanner fe0a4dedb4 feat: Add static rendering for article pages
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-04 18:01:00 +00:00
Tanner d62d99f05c Only wrap code in comments 2025-12-03 04:18:36 +00:00
Tanner a173858629 fix: Refine code block detection to ignore inline <code>
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 03:57:08 +00:00
Tanner ddb969125d fix: Refine code block detection to exclude inline code
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 03:55:18 +00:00
Tanner edc312b5be fix: Use textContent for code block conversion to prevent content loss
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 03:51:33 +00:00
Tanner d66747fdb1 refactor: Optimize nodes() calls and simplify function in Article
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 03:50:10 +00:00
Tanner b798fd7456 fix: Render void elements correctly and copy all attributes
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 03:12:51 +00:00
Tanner 721af4beca refactor: Implement recursive rendering to detect and convert code blocks
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 02:52:07 +00:00
Tanner 8c0caf1c39 fix: Unwrap single-child wrapper elements in nodes function
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 02:46:20 +00:00
Tanner 28e92a057f chore: Add debug log to isCodeBlock function 2025-12-03 02:46:18 +00:00
Tanner 9840859aff fix: Relax isCodeBlock check for nested code elements
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 02:37:58 +00:00
Tanner 98e9f64f1f refactor: Refactor nodes logic from useMemo to a regular function 2025-12-03 02:37:56 +00:00
Tanner c78cdf2c4d refactor: Extract code block detection into isCodeBlock function
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 01:46:19 +00:00
Tanner d99144f4d8 fix: Detect code blocks nested in pre tags for conversion
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 01:43:33 +00:00
Tanner 6060acb676 fix: Show 'Convert Code to Paragraph' button for <code> elements
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 01:37:08 +00:00
Tanner 7db174a249 fix: Adjust spacing below comment text content
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 01:28:10 +00:00
Tanner 4b10320a03 fix: Wrap text in <pre> blocks to prevent horizontal overflow
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 00:58:39 +00:00
Tanner dc3583daaa refactor: Convert 'show more' div to semantic button
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 00:50:58 +00:00
Tanner 9681b3b8e9 refactor: Convert collapser span to button for accessibility
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-03 00:48:22 +00:00
Tanner b606516ece refactor: Remove unnecessary useCallback from comment functions
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 23:53:40 +00:00
Tanner 2cb524d35e Mark deleted / empty comments 2025-12-02 23:39:24 +00:00
Tanner 2c86b4d144 Add a copy button to the article title 2025-12-02 23:19:31 +00:00
Tanner 66deed7544 fix: Align article title and copy button, correct icon font
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 23:19:31 +00:00
Tanner e117ffe56c style: Update copy link button font 2025-12-02 23:19:31 +00:00
Tanner 218c273a6d fix: Improve copy button icon display and alignment
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 23:19:31 +00:00
Tanner fec6c897b2 style: Style copy button icon 2025-12-02 23:19:31 +00:00
Tanner adf1f079e4 feat: Use icons for copy link button feedback
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 23:19:31 +00:00
Tanner e756545d2c feat: Add button to copy article title and URL to clipboard
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 23:19:31 +00:00
Tanner 00f4ee5565 Move static build directory to apiserver/ 2025-12-02 22:38:49 +00:00
Tanner 56ddee2f3e refactor: Iterate through stories in order for prioritized updates
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 22:37:58 +00:00
Tanner 14f4b09d12 fix: Unregister service worker
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 17:13:52 +00:00
Tanner 2a48b4a28d Revert ScrollToTop component back to class-based 2025-12-02 17:02:03 +00:00
Tanner 42f5614524 Don't setStories every loop iteration 2025-12-02 16:52:32 +00:00
Tanner 06f35fbe8a feat: Add loading progress indicator to Feed
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-12-02 01:20:27 +00:00
Tanner 47d401a536 feat: Add fetching stories placeholder 2025-12-02 01:20:25 +00:00
Tanner d499cdedb0 Misc fixes 2025-12-01 21:07:01 +00:00
Tanner 9fa2699e26 fix: Improve submit error handling on API and refactor client with async/await
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 23:02:29 +00:00
Tanner d8a35ac467 fix: Improve error handling for non-JSON server responses in Submit
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 22:59:15 +00:00
Tanner ca6b5ce677 feat: Display detailed submission errors to user
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 22:56:48 +00:00
Tanner 914fda0f4b feat: Display detailed, expandable connection error in Comments component
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 22:51:14 +00:00
Tanner 97e15be797 fix: Conditionally render error details to avoid layout gap
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 22:45:58 +00:00
Tanner 1fad0459b8 refactor: Improve article loading error and cache messages 2025-11-21 22:45:54 +00:00
Tanner 8df6cb7b36 fix: Prevent layout shift when error message appears
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 22:39:34 +00:00
Tanner 79b857548b feat: Persist new stories and improve layout consistency 2025-11-21 22:39:32 +00:00
Tanner c1a6938a50 feat: Add detailed, expandable error messages to Article component
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 22:34:24 +00:00
Tanner af152b8848 feat: Show preload progress on fetch failure
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 00:59:14 +00:00
Tanner 36c6e77548 style: Improve error messages and loading text, add spacing to error details 2025-11-21 00:59:12 +00:00
Tanner 913adb0150 fix: Provide detailed error for story fetch failures
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 00:50:58 +00:00
Tanner c09ffecdd9 fix: Display network error on API fetch failure
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 00:49:14 +00:00
Tanner a9b688309c fix: Provide detailed error messages for network failures
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 00:45:59 +00:00
Tanner 9fb0e9a679 feat: Show detailed connection errors in collapsible section 2025-11-21 00:41:57 +00:00
Tanner 7a5cc94d60 feat: Add 10s timeout and early exit for story preloading on error
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-21 00:34:17 +00:00
Tanner cf7c91554d feat: Immediately display stories on first load
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-20 23:02:59 +00:00
Tanner 9ae9ac903e fix: Always fetch full story and update existing in feed
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-20 22:58:44 +00:00
Tanner 2c5147a64d Begin stats API route 2025-11-20 22:25:26 +00:00
Tanner ba374bea66 Ignore aider files 2025-11-20 22:25:20 +00:00
Tanner c724838523 Add debug logging, debug add manual submissions to feed 2025-11-20 21:55:45 +00:00
Tanner 98067ef81f Logging 2025-11-19 19:17:38 +00:00
Tanner 2d8f69a367 fix: Batch story list updates and limit length
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-19 19:17:38 +00:00
Tanner 4cf3bc4186 chore: Add console log for stories 2025-11-19 19:17:38 +00:00
Tanner cabaca6051 fix: Fix infinite loop in Feed by removing stories from useEffect deps
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-19 19:17:38 +00:00
Tanner 90f38f0bcc refactor: Refactor Feed story fetching for improved network resilience
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
2025-11-19 19:17:38 +00:00
Tanner ccde7a1486 chore: Disable story updates and preloading logic 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) d2408dd502 refactor: Refactor dot components to functional 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) 77809ff73b refactor: Refactor Submit component to use hooks 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) ecc69bf611 refactor: Refactor Search component to use hooks 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) 528539f049 refactor: Convert ScrollToTop to functional component with hooks 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) 4cb578ac49 refactor: refactor Results component to functional component 2025-11-19 19:17:38 +00:00
Tanner 28da3cd9f5 Update webclient dependencies 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) d365cad14b refactor: Refactor Feed component to functional with hooks 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) 9300688ceb refactor: Convert Comments class to functional using hooks 2025-11-19 19:17:38 +00:00
Tanner 6996b5f927 refactor: Rename Article component to Comments 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) 135146d8ee refactor: Refactor Article component to use hooks 2025-11-19 19:17:38 +00:00
Tanner Collin (aider) ef74a2d7cf refactor: Convert App class component to functional component 2025-11-19 19:17:38 +00:00
Tanner 6fa02367dd Ignore blank hackernews titles 2025-11-19 19:17:38 +00:00
Tanner 10629200b0 Skip "Removed by moderator" stories 2025-09-27 17:38:50 +00:00
Tanner b09d432c10 Ignore dead and political stories 2025-05-27 18:47:17 +00:00
Tanner 1569c16e70 Fix Better HN api content extraction 2025-02-01 22:39:13 +00:00
Tanner 9f52226d47 Add Better HN as an API backup 2025-02-01 21:42:06 +00:00
Tanner b3eb55e2d1 Bug fixes 2025-02-01 20:31:35 +00:00
Tanner fafed2cf3d Alert on story update error 2024-03-16 20:41:24 +00:00
Tanner 7dc9e3ad16 Adjust score and comment thresholds 2024-03-08 03:08:18 +00:00
Tanner 6c1aced06c Fix deletion script 2024-03-08 03:08:03 +00:00
Tanner 005004ad3d Increase database timeout 2024-02-27 18:48:56 +00:00
Tanner 12bb83ade8 Fix lobsters comment parsing 2024-02-27 18:47:00 +00:00
Tanner 8f70f7a6c2 Move scripts into own folder 2024-02-27 18:32:29 +00:00
Tanner 414dcdcce9 Update readability 2024-02-27 18:32:19 +00:00
Tanner f5ce14a749 Make "dark" theme grey, add "black" theme 2023-09-13 01:19:47 +00:00
Tanner e85f0d8e19 Disable lobsters 2023-09-13 01:02:15 +00:00
Tanner c228ab0635 Replace "indent_level" with "depth" in lobsters API
See:
https://github.com/lobsters/lobsters/commit/fe09e5aa31993e09ed4ad255bb4a359f1e8a2d62
2023-08-31 07:35:44 +00:00
Tanner f11123b441 Handle Lobsters comment parsing TypeErrors
Too lazy to debug this:

2023-08-29 12:56:35,111 - root - INFO - Updating lobsters story: yktkwr, index: 55
Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 854, in gevent._gevent_cgreenlet.Greenlet.run
  File "/home/tanner/qotnews/apiserver/server.py", line 194, in feed_thread
    valid = feed.update_story(story)
  File "/home/tanner/qotnews/apiserver/feed.py", line 74, in update_story
    res = lobsters.story(story['ref'])
  File "/home/tanner/qotnews/apiserver/feeds/lobsters.py", line 103, in story
    s['comments'] = iter_comments(r['comments'])
  File "/home/tanner/qotnews/apiserver/feeds/lobsters.py", line 76, in iter_comments
    parent_stack = parent_stack[:indent-1]
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
2023-08-29T12:56:35Z <Greenlet at 0x7f92ad840ae0: feed_thread> failed with TypeError
2023-08-31 07:30:39 +00:00
Tanner 83fea72578 Add Tildes group whitelist 2023-07-13 22:54:36 +00:00
Tanner 3a0f7e79a2 Increase again 2023-06-13 17:11:50 +00:00
Tanner 0cfbf152eb Increase Tildes story score requirement 2023-06-11 01:01:31 +00:00
Tanner b051794a0c Catch all possible Reddit API exceptions 2023-03-15 21:16:37 +00:00
Tanner a56d5e91aa Fix darkmode fullscreen button color 2022-08-11 19:36:36 +00:00
Tanner 8218c849b7 Fix fix-stories bug 2022-08-10 04:06:39 +00:00
Tanner 952c891643 Hide fullscreen button if it's not available 2022-08-10 04:05:25 +00:00
Tanner b6b9dfaabb Add fullscreen mode 2022-08-08 23:21:49 +00:00
Tanner 832f2446a6 Add red theme 2022-08-08 20:14:57 +00:00
Tanner 545f2f300e Write fixed stories to database 2022-07-05 00:57:56 +00:00
Tanner 32e1ed615e Begin script to fix bad gzip text 2022-07-04 20:32:01 +00:00
Tanner 3db01c01f0 Move FEED_LENGTH to settings.py, use for search results 2022-07-04 19:08:24 +00:00
Tanner 0be6f62a39 Small UI changes 2022-07-04 19:08:24 +00:00
Tanner 8802b28b97 Add accept gzip header to readability server 2022-07-04 19:07:31 +00:00
Tanner 9fa9899965 Add test file 2022-07-04 05:56:06 +00:00
Tanner 4816e8a5ed Fix requests text encoding slowness 2022-07-04 05:55:52 +00:00
Tanner 26f03f883b Return search results directly from the server 2022-07-04 04:33:01 +00:00
Tanner b417a44314 Remove Article / Comments, etc thing after name 2022-07-04 04:33:01 +00:00
Tanner e2790ddb76 Remove hard-coded title 2022-06-30 00:12:22 +00:00
Tanner c1aede8f1f Adjust title 2022-06-30 00:05:15 +00:00
Tanner 0d00ec40b0 Change header based on page 2022-06-30 00:00:30 +00:00
Tanner 6cc24281d9 Add index / noindex to client 2022-06-29 23:30:39 +00:00
Tanner d6cf56677b Add noindex meta tag to stories 2022-06-29 23:20:53 +00:00
Tanner dae40b27cb Increase database timeout 2022-06-24 20:50:27 +00:00
Tanner 3d8ed467c9 Update software 2022-05-31 04:24:12 +00:00
Tanner 104e2853af Explain no javascript 2022-05-31 04:23:52 +00:00
Tanner e93230d518 Improve logging, sends tweets to nitter.net 2022-03-05 23:48:46 +00:00
Tanner f16abc2074 Remove outline API 2022-03-05 22:05:29 +00:00
Tanner c946b150ff Include option to disable readerserver 2022-03-05 22:04:25 +00:00
Tanner 63963e2c74 Include option to disable search 2022-03-05 21:58:35 +00:00
Tanner e48846f4c3 Fix search to work with low-RAM server 2022-03-05 21:33:07 +00:00
Tanner bac043d298 Improve logging 2021-09-06 00:21:05 +00:00
Tanner 13d00433cd Add script to reindex search, abstract search API 2021-09-06 00:20:21 +00:00
Tanner 30c73a920a Change the order by which content-type is grabbed 2021-01-30 06:36:02 +00:00
Tanner cf5c1a3d44 Add optional skip and limit to API route 2021-01-18 03:59:33 +00:00
Tanner 214f4c50b8 Remove colons from date string so Python 3.5 can parse 2020-12-15 23:19:50 +00:00
Tanner ce3b986e47 Add Lobsters to feed 2020-12-12 05:26:33 +00:00
Tanner 6a73fa548e Update gitignore 2020-12-11 23:49:45 +00:00
Tanner db92b39cda Increase sqlite lock timeout 2020-11-19 21:38:18 +00:00
Tanner f0946c7d1f Blacklist sec.gov website 2020-11-19 21:37:59 +00:00
Tanner 410499518e Add header to get content type 2020-11-03 20:27:43 +00:00
Tanner c4c3f448b1 Clean code up 2020-11-03 03:45:56 +00:00
Tanner 837bb91bcc Move feed and Praw config to settings.py 2020-11-02 02:26:54 +00:00
Tanner 70a427d533 Fix index.html indentation 2020-11-02 00:38:34 +00:00
Tanner df0a5cd2b3 Fix noscript font color 2020-11-02 00:36:11 +00:00
Tanner a2b60f04ad Remove Whoosh 2020-11-02 00:22:40 +00:00
Tanner 278fbf9a1c Try Hackernews API twice 2020-11-02 00:17:22 +00:00
Tanner 917e6c412f Improve logging 2020-11-02 00:13:43 +00:00
Tanner e5e523dbf5 Fix table width CSS 2020-11-01 00:47:18 +00:00
Tanner 0ad381bb7e Make qotnews work with WaPo 2020-10-29 04:55:34 +00:00
Tanner 81a9e77cd4 Upgrade readability 2020-10-29 01:24:13 +00:00
Tanner 929a4fc491 Show exerpt of hidden comments 2020-10-27 00:41:36 +00:00
Tanner 0ac23028b2 Fix bug with rendering text nodes 2020-10-26 21:58:36 +00:00
Tanner 7ecc7bc6b4 Add instructions to download search server 2020-10-26 21:58:36 +00:00
Tanner dd5b50edf8 Add buttons to collapse / expand comments 2020-10-26 21:57:10 +00:00
Tanner 864e6192db Monkeypatch earlier 2020-10-24 22:30:00 +00:00
Tanner cdb14b5d4e Add a script to delete a story 2020-10-03 23:42:21 +00:00
Tanner 4e1566961a Adjust feeds 2020-10-03 23:41:57 +00:00
Tanner 56302077ad Add buttons to convert <pre> to <p> 2020-10-03 23:23:25 +00:00
Tanner 8e0c11e5e0 Add a line on UI to make search results obvious 2020-08-14 03:58:11 +00:00
Tanner ec859ff64c Adjust content-type request timeout 2020-08-14 03:57:43 +00:00
Tanner 33ac9e63dd Adjust port 2020-08-14 03:57:18 +00:00
Tanner 835c6241ee Delete displayed-attributes when init search 2020-08-14 03:56:47 +00:00
Tanner e05aa1c6ee Remove business subreddit from feed 2020-08-14 03:55:28 +00:00
Tanner 1c420cc49d Update requirements 2020-07-08 05:24:32 +00:00
Tanner 38cfc4bda4 Remove extra logging 2020-07-08 02:36:40 +00:00
Tanner 162142083b Fix crash when HN feed fails 2020-07-08 02:36:40 +00:00
Tanner 02c8bbad20 Remove document img and ignore r/technology 2020-07-08 02:36:40 +00:00
Tanner 5ef0fd120b Tune search rankings and attributes 2020-07-08 02:36:40 +00:00
Tanner 5634cc812c Add more logging 2020-07-08 02:36:40 +00:00
Tanner 3ac032b817 Remove article numbers 2020-07-08 02:36:40 +00:00
Tanner b0c3e9a06d Remove pre-fetching image 2020-07-08 02:36:40 +00:00
Tanner 21c221925d Remove get first image 2020-07-08 02:36:40 +00:00
Tanner 437b1e313b Add requests timeouts and temporary logging 2020-07-08 02:36:40 +00:00
Tanner 64ef3a2a18 Integrate with external MeiliSearch server 2020-07-08 02:36:40 +00:00
Tanner d69e054311 Integrate sqlite database with server 2020-07-08 02:36:40 +00:00
Tanner 5913f894ca Update whoosh migration script 2020-07-08 02:36:40 +00:00
Tanner 894d3654c0 Store ref list in database too 2020-07-08 02:36:40 +00:00
Tanner e97bc4b2c7 Begin initial sqlite conversion 2020-07-08 02:36:40 +00:00
Tanner 8c1ddd4a43 Check if cache is broken 2020-07-08 02:36:40 +00:00
Tanner 490bcd5235 Fall back to ref on manual submission title 2020-07-08 02:36:40 +00:00
Tanner f956656647 Check content-type 2020-07-08 02:36:40 +00:00
Tanner c3c5fa0c0a Remove technology subreddit 2020-07-08 02:36:40 +00:00
Tanner 54f30e20f5 Update tildes parser group tag 2020-07-08 02:36:40 +00:00
Tanner fb99f26dcf Make noscript background white 2020-06-22 20:52:51 +00:00
Tanner 7719acee01 Fix cache load race condition bug 2020-01-28 04:20:48 +00:00
Tanner 4773f9766f Remove preload of news source icons 2020-01-28 04:20:29 +00:00
Tanner ee3df25c63 Remove keys of uncached stories 2020-01-28 04:20:05 +00:00
Tanner de1bcd9abc Fix tildes deleted comment parser error 2020-01-28 04:19:26 +00:00
Tanner 9cc73da33c Add del tag and sort tags 2020-01-04 23:37:41 +00:00
Tanner 593a645089 Fix back/forward scroll jump issue 2020-01-04 23:36:24 +00:00
Tanner d068e6eec4 Add forward button, convert icons to font 2020-01-03 03:45:56 +00:00
Tanner 48b9c67a9b Add style changes to prevent horizontal scrolling 2019-12-22 21:43:33 +00:00
Tanner 957beea2a7 Stop using archive.is on articles (hits CAPTCHAs) 2019-12-15 22:47:33 +00:00
Tanner 999f8b99e8 Fix search result icons 2019-12-14 07:39:25 +00:00
Tanner 114be7a559 Whitelist more html tags 2019-12-14 07:39:10 +00:00
Tanner 60a4e08479 Embed base64 logo directly in source to avoid load 2019-12-02 23:54:02 +00:00
Tanner ad5da72578 Grab comments on manually submitted links 2019-12-02 23:15:51 +00:00
Tanner 393b676791 Sanitize html 2019-12-01 22:18:41 +00:00
Tanner a9dbfa0a6f Decrease feed cache length to 150 2019-12-01 22:18:14 +00:00
Tanner 9053bced58 Add logo for manual submissions 2019-11-14 08:38:11 +00:00
Tanner f11d4ff20c Drop articles more than two days old 2019-11-08 21:50:33 +00:00
Tanner 6ca4a32030 Allow manual submission of articles 2019-11-08 05:55:30 +00:00
Tanner 5482af40e5 Move to gevent production http server 2019-11-08 02:37:57 +00:00
Tanner d6619f188c Handle hostnames better 2019-11-07 22:10:08 +00:00
Tanner dc87026f99 Add subreddit 2019-11-07 22:09:45 +00:00
Tanner 5b58e03dbc Abort previous search requests 2019-11-07 22:08:28 +00:00
Tanner 700fd8d6a6 Get rid of lint warnings 2019-10-22 07:31:59 +00:00
Tanner 9283f8439c Fix Tildes down for maintenance edge case 2019-10-22 05:01:30 +00:00
Tanner 0742432541 Prefetch first images 2019-10-19 07:33:06 +00:00
Tanner 5de8631115 Cache articles in memory for speed 2019-10-18 21:26:22 +00:00
Tanner b09599aa5f Add serviceworker, render logos directly 2019-10-18 05:09:49 +00:00
Tanner 75bbf09143 Fix underlines 2019-10-18 01:20:38 +00:00
Tanner 109ba0eb23 Fix crash from domain and ext check bug 2019-10-16 08:56:31 +00:00
Tanner ca9bed855f Fix copy/paste error, switch to info logging 2019-10-16 05:26:47 +00:00
Tanner 9f60ee7864 Begin README and add license 2019-10-15 16:40:55 -06:00
Tanner 8f8a11954a Archive WSJ articles first, catch KeyboardInterrupt 2019-10-15 21:03:47 +00:00
Tanner c4281ca215 Stop using python keyword id for id 2019-10-15 20:36:20 +00:00
Tanner 6bd3bf1090 Cache all articles in IndexedDB 2019-10-12 23:41:31 +00:00
Tanner f798c06a9b Move archive to Whoosh and add search 2019-10-12 05:32:17 +00:00
Tanner 5f8884a5ca Gitkeep archive directory 2019-10-10 21:55:21 +00:00
Tanner 055439c6db Serve client through apiserver, adding meta info 2019-10-10 21:54:29 +00:00
Tanner d49e686f8f Set title on article and comment pages, add comment anchors 2019-10-10 21:52:28 +00:00
Tanner 536214be1f Fix Tildes comments with unknown authors 2019-10-08 08:01:17 +00:00
Tanner c396a432d8 Archive Bloomberg articles first 2019-10-08 08:00:50 +00:00
Tanner 37283d09dc Gitkeep apiserver data directory 2019-10-08 07:59:30 +00:00
Tanner e82a72e9b7 Add huge margin to bottom of body for better pagescroll 2019-09-24 18:40:22 +00:00
Tanner 9a2854c735 Add site logos, keep displaying news on error 2019-09-24 08:23:14 +00:00
Tanner 8a01d533da Ignore certain files and domains, remove refs 2019-09-24 08:22:06 +00:00
Tanner 1acdd92cbf Ignore new Tildes posts and handle deleted ones 2019-09-24 08:21:26 +00:00
Tanner 37904a467b Handle Reddit PRAW exceptions 2019-09-24 08:20:46 +00:00
Tanner 1682cb8247 Filter out False comments 2019-08-30 06:23:14 +00:00
Tanner b43fed8e44 Settle on serif font, add scroll to top component 2019-08-30 06:22:26 +00:00
Tanner f1c89fcf8b Render reddit markdown, poll tildes better, add utils 2019-08-28 04:13:02 +00:00
Tanner 5f999b6263 Snip deeply nested comments 2019-08-26 01:37:50 +00:00
Tanner b76cbcd046 Try outline.com for reader mode first 2019-08-25 23:49:08 +00:00
Tanner 277595f62f Add favicons to webclient 2019-08-25 23:48:24 +00:00
Tanner 64d29c8e77 Add a button to toggle between article and comments 2019-08-25 08:50:49 +00:00
Tanner 7b97ca6786 Add fonts, fix styling issues 2019-08-25 07:46:58 +00:00
Tanner fb49dde034 Fix tildes comments parsing bug 2019-08-25 07:46:22 +00:00
Tanner 2cf1c44fb9 Clear localstorage cache and add slogan 2019-08-25 01:25:28 +00:00
Tanner 5ff27591a7 Add tildes to feeds 2019-08-25 00:36:26 +00:00
Tanner 3842ec24d9 Add reddit to feeds 2019-08-24 21:37:43 +00:00
Tanner 052ba9fa4e Remove DOMPurify import 2019-08-24 08:49:53 +00:00
Tanner 8e73f746c3 Abstract api server feeds 2019-08-24 08:49:11 +00:00
Tanner 44db099614 Stop running DOMPurify on reader server 2019-08-24 05:09:02 +00:00
Tanner 351b8a00ee Write news stories to disk 2019-08-24 05:07:16 +00:00
Tanner 7bb72f9b96 Finish prototype web client 2019-08-24 05:04:51 +00:00
Tanner d039191232 Finish prototype api server 2019-08-23 08:23:48 +00:00
Tanner 13e99bb52e Figure out .gitignores 2019-08-23 08:23:26 +00:00
Tanner 895be96d7a Change reader server useragent and port 2019-08-23 08:21:25 +00:00
Tanner 47398c1a6e Prototype readability server 2019-08-20 21:49:06 -06:00
Tanner 5faa4b91fa Initial commit 2019-08-20 21:48:55 -06:00
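Note on f11123b441 and c228ab0635: the traceback quoted in the log fails at `parent_stack[:indent-1]` because the indent value arrived as `None`. The sketch below shows one way a flat, depth-annotated comment list could be nested while tolerating a missing depth. It is not the code in `feeds/lobsters.py` (that file is not shown here); the function name `nest_comments`, the `text` key, and the choice to treat an unknown depth as top-level are assumptions made for illustration, as is the 0-based `depth` field.

```python
def nest_comments(flat_comments):
    """Build a comment tree from a flat list of depth-annotated comments.

    Hypothetical helper, not the project's iter_comments(). Assumes each
    item is a dict with an integer 'depth' (0 = top-level) that may be
    missing or None, and a 'text' field.
    """
    roots = []
    parent_stack = []  # parent_stack[d] is the most recent comment at depth d

    for item in flat_comments:
        depth = item.get('depth')
        if not isinstance(depth, int) or depth < 0:
            depth = 0  # assumption: unknown depth is treated as top-level
        # Never jump more than one level deeper than the current stack.
        depth = min(depth, len(parent_stack))

        node = {'text': item.get('text', ''), 'comments': []}
        parent_stack = parent_stack[:depth]
        if parent_stack:
            parent_stack[-1]['comments'].append(node)
        else:
            roots.append(node)
        parent_stack.append(node)

    return roots


if __name__ == '__main__':
    flat = [
        {'text': 'a', 'depth': 0},
        {'text': 'b', 'depth': 1},
        {'text': 'c', 'depth': None},  # would have raised TypeError before
        {'text': 'd', 'depth': 1},
    ]
    for root in nest_comments(flat):
        print(root['text'], [child['text'] for child in root['comments']])
```

Falling back to top-level keeps the comment visible instead of dropping the whole story update; whether that is the right trade-off depends on how often the upstream API omits the field.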
55 changed files with 7211 additions and 4506 deletions
+1
@@ -0,0 +1 @@
.aider*
+1 -1
@@ -1,6 +1,6 @@
 The MIT License (MIT)
-Copyright (c) 2019 Tanner Collin
+Copyright (c) 2019 Tanner (tanner.vc)
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
+6 -8
@@ -20,7 +20,7 @@ $ sudo apt install yarn
 Clone this repo:
 ```text
-$ git clone https://gogs.tannercollin.com/tanner/qotnews.git
+$ git clone https://git.tanner.vc/tanner/qotnews.git
 $ cd qotnews
 ```
@@ -35,7 +35,7 @@ $ source env/bin/activate
 (env) $ pip install -r requirements.txt
 ```
-Configure Praw for your Reddit account:
+Configure Praw for your Reddit account (optional):
 * Go to https://www.reddit.com/prefs/apps
 * Click "Create app"
@@ -44,16 +44,14 @@ Configure Praw for your Reddit account:
 * Description: blank
 * About URL: blank
 * Redirect URL: your GitHub profile
-* Submit, copy the client ID and client secret into `praw.ini`:
+* Submit, copy the client ID and client secret into `settings.py` below
 ```text
-(env) $ vim praw.ini
-[bot]
-client_id=paste here
-client_secret=paste here
-user_agent=script by github/your-username-here
+(env) $ vim settings.py.example
 ```
+Edit it and save it as `settings.py`.
 Now you can run the server:
 ```text
+2 -1
@@ -105,8 +105,9 @@ ENV/
 # DB
 db.sqlite3
-praw.ini
+settings.py
 data.db
 data.db.bak
 data/archive/*
+data/backup/*
 qotnews.sqlite
-52
@@ -1,52 +0,0 @@
-from whoosh.analysis import StemmingAnalyzer, CharsetFilter, NgramFilter
-from whoosh.index import create_in, open_dir, exists_in
-from whoosh.fields import *
-from whoosh.qparser import QueryParser
-from whoosh.support.charset import accent_map
-analyzer = StemmingAnalyzer() | CharsetFilter(accent_map) | NgramFilter(minsize=3)
-title_field = TEXT(analyzer=analyzer, stored=True)
-id_field = ID(unique=True, stored=True)
-schema = Schema(
-    id=id_field,
-    title=title_field,
-    story=STORED,
-)
-ARCHIVE_LOCATION = 'data/archive'
-ix = None
-def init():
-    global ix
-    if exists_in(ARCHIVE_LOCATION):
-        ix = open_dir(ARCHIVE_LOCATION)
-    else:
-        ix = create_in(ARCHIVE_LOCATION, schema)
-def update(story):
-    writer = ix.writer()
-    writer.update_document(
-        id=story['id'],
-        title=story['title'],
-        story=story,
-    )
-    writer.commit()
-def get_story(sid):
-    with ix.searcher() as searcher:
-        result = searcher.document(id=sid)
-        return result['story'] if result else None
-def search(search):
-    with ix.searcher() as searcher:
-        query = QueryParser('title', ix.schema).parse(search)
-        results = searcher.search(query)
-        stories = [r['story'] for r in results]
-        for s in stories:
-            s.pop('text', '')
-            s.pop('comments', '')
-        return stories
+19 -3
@@ -5,7 +5,7 @@ from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker from sqlalchemy.orm import sessionmaker
from sqlalchemy.exc import IntegrityError from sqlalchemy.exc import IntegrityError
engine = create_engine('sqlite:///data/qotnews.sqlite') engine = create_engine('sqlite:///data/qotnews.sqlite', connect_args={'timeout': 360})
Session = sessionmaker(bind=engine) Session = sessionmaker(bind=engine)
Base = declarative_base() Base = declarative_base()
@@ -68,12 +68,13 @@ def get_reflist(amount):
q = session.query(Reflist).order_by(Reflist.rid.desc()).limit(amount) q = session.query(Reflist).order_by(Reflist.rid.desc()).limit(amount)
return [dict(ref=x.ref, sid=x.sid, source=x.source) for x in q.all()] return [dict(ref=x.ref, sid=x.sid, source=x.source) for x in q.all()]
def get_stories(amount): def get_stories(amount, skip=0):
session = Session() session = Session()
q = session.query(Reflist, Story.meta_json).\ q = session.query(Reflist, Story.meta_json).\
order_by(Reflist.rid.desc()).\ order_by(Reflist.rid.desc()).\
join(Story).\ join(Story).\
filter(Story.title != None).\ filter(Story.title != None).\
offset(skip).\
limit(amount) limit(amount)
return [x[1] for x in q] return [x[1] for x in q]
@@ -100,7 +101,22 @@ def del_ref(ref):
finally: finally:
session.close() session.close()
def count_stories():
try:
session = Session()
return session.query(Story).count()
finally:
session.close()
def get_story_list():
try:
session = Session()
return session.query(Story.sid).all()
finally:
session.close()
if __name__ == '__main__': if __name__ == '__main__':
init() init()
print(get_story_by_ref('hgi3sy')) #print(get_story_by_ref('hgi3sy'))
print(len(get_reflist(99999)))
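The new `skip` argument to `get_stories` corresponds to SQL `OFFSET`, applied alongside `LIMIT` on a newest-first ordering. A minimal sketch of the same paging with the standard-library `sqlite3` module follows. The `reflist` table here is a simplified stand-in for the real schema and the `get_page` helper is hypothetical, not part of the repo.

```python
import sqlite3

def get_page(conn, amount, skip=0):
    """Return up to `amount` refs, newest first, after skipping `skip` rows."""
    rows = conn.execute(
        'SELECT ref FROM reflist ORDER BY rid DESC LIMIT ? OFFSET ?',
        (amount, skip),
    )
    return [row[0] for row in rows]

# In-memory database with five rows; rid is assigned 1..5 in insertion order.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE reflist (rid INTEGER PRIMARY KEY, ref TEXT)')
conn.executemany(
    'INSERT INTO reflist (ref) VALUES (?)',
    [('a',), ('b',), ('c',), ('d',), ('e',)],
)

first_page = get_page(conn, 2)            # the two newest rows
second_page = get_page(conn, 2, skip=2)   # the next two
```

Offset paging like this can shift by a row if new stories arrive between two requests, which is usually acceptable for a news list.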
+47 -40
@@ -7,46 +7,40 @@ import requests
import time import time
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from feeds import hackernews, reddit, tildes, manual import settings
from feeds import hackernews, reddit, tildes, manual, lobsters
import utils
OUTLINE_API = 'https://api.outline.com/v3/parse_article' INVALID_DOMAINS = ['youtube.com', 'bloomberg.com', 'wsj.com', 'sec.gov']
ARCHIVE_API = 'https://archive.fo/submit/'
READ_API = 'http://127.0.0.1:33843'
INVALID_DOMAINS = ['youtube.com', 'bloomberg.com', 'wsj.com']
TWO_DAYS = 60*60*24*2 TWO_DAYS = 60*60*24*2
def list(): def list():
feed = [] feed = []
feed += [(x, 'hackernews') for x in hackernews.feed()[:15]] if settings.NUM_HACKERNEWS:
feed += [(x, 'reddit') for x in reddit.feed()[:10]] feed += [(x, 'hackernews') for x in hackernews.feed()[:settings.NUM_HACKERNEWS]]
feed += [(x, 'tildes') for x in tildes.feed()[:5]]
if settings.NUM_LOBSTERS:
feed += [(x, 'lobsters') for x in lobsters.feed()[:settings.NUM_LOBSTERS]]
if settings.NUM_REDDIT:
feed += [(x, 'reddit') for x in reddit.feed()[:settings.NUM_REDDIT]]
if settings.NUM_TILDES:
feed += [(x, 'tildes') for x in tildes.feed()[:settings.NUM_TILDES]]
return feed return feed
def get_article(url): def get_article(url):
try: if not settings.READER_URL:
params = {'source_url': url} logging.info('Readerserver not configured, aborting.')
headers = {'Referer': 'https://outline.com/'}
r = requests.get(OUTLINE_API, params=params, headers=headers, timeout=20)
if r.status_code == 429:
logging.info('Rate limited by outline, sleeping 30s and skipping...')
time.sleep(30)
return '' return ''
if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code))
html = r.json()['data']['html']
if 'URL is not supported by Outline' in html:
raise Exception('URL not supported by Outline')
return html
except KeyboardInterrupt:
raise
except BaseException as e:
logging.error('Problem outlining article: {}'.format(str(e)))
logging.info('Trying our server instead...') if url.startswith('https://twitter.com'):
logging.info('Replacing twitter.com url with nitter.net')
url = url.replace('twitter.com', 'nitter.net')
try: try:
r = requests.post(READ_API, data=dict(url=url), timeout=10) r = requests.post(settings.READER_URL, data=dict(url=url), timeout=20)
if r.status_code != 200: if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code)) raise Exception('Bad response code ' + str(r.status_code))
return r.text return r.text
@@ -57,31 +51,39 @@ def get_article(url):
return '' return ''
def get_content_type(url): def get_content_type(url):
try:
headers = {'User-Agent': 'Twitterbot/1.0'}
return requests.get(url, headers=headers, timeout=2).headers['content-type']
except:
pass
try: try:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0'} headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0'}
return requests.get(url, headers=headers, timeout=10).headers['content-type'] return requests.get(url, headers=headers, timeout=5).headers['content-type']
except: except:
return '' return ''
try:
headers = {
'User-Agent': 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
'X-Forwarded-For': '66.249.66.1',
}
return requests.get(url, headers=headers, timeout=10).headers['content-type']
except:
pass
def update_story(story, is_manual=False): def update_story(story, is_manual=False):
res = {} res = {}
logging.info('Updating story ' + str(story['ref'])) try:
if story['source'] == 'hackernews': if story['source'] == 'hackernews':
res = hackernews.story(story['ref']) res = hackernews.story(story['ref'])
elif story['source'] == 'lobsters':
res = lobsters.story(story['ref'])
elif story['source'] == 'reddit': elif story['source'] == 'reddit':
res = reddit.story(story['ref']) res = reddit.story(story['ref'])
elif story['source'] == 'tildes': elif story['source'] == 'tildes':
res = tildes.story(story['ref']) res = tildes.story(story['ref'])
elif story['source'] == 'manual': elif story['source'] == 'manual':
res = manual.story(story['ref']) res = manual.story(story['ref'])
except BaseException as e:
utils.alert_tanner('Problem updating {} story, ref {}: {}'.format(story['source'], story['ref'], str(e)))
logging.exception(e)
return False
if res: if res:
story.update(res) # join dicts story.update(res) # join dicts
@@ -90,11 +92,10 @@ def update_story(story, is_manual=False):
return False return False
if story['date'] and not is_manual and story['date'] + TWO_DAYS < time.time(): if story['date'] and not is_manual and story['date'] + TWO_DAYS < time.time():
logging.info('Story too old, removing') logging.info('Story too old, removing. Date: {}'.format(story['date']))
return False return False
if story.get('url', '') and not story.get('text', ''): if story.get('url', '') and not story.get('text', ''):
logging.info('inside if')
if not get_content_type(story['url']).startswith('text/'): if not get_content_type(story['url']).startswith('text/'):
logging.info('URL invalid file type / content type:') logging.info('URL invalid file type / content type:')
logging.info(story['url']) logging.info(story['url'])
@@ -105,6 +106,12 @@ def update_story(story, is_manual=False):
logging.info(story['url']) logging.info(story['url'])
return False return False
if 'trump' in story['title'].lower() or 'musk' in story['title'].lower() or 'Removed by moderator' in story['title']:
logging.info('Trump / Musk / removed story, skipping')
logging.info(story['url'])
return False
logging.info('Getting article ' + story['url']) logging.info('Getting article ' + story['url'])
story['text'] = get_article(story['url']) story['text'] = get_article(story['url'])
if not story['text']: return False if not story['text']: return False
@@ -122,7 +129,7 @@ if __name__ == '__main__':
#print(get_article('https://www.bloomberg.com/news/articles/2019-09-23/xi-s-communists-under-pressure-as-high-prices-hit-china-workers')) #print(get_article('https://www.bloomberg.com/news/articles/2019-09-23/xi-s-communists-under-pressure-as-high-prices-hit-china-workers'))
a = get_article('https://blog.joinmastodon.org/2019/10/mastodon-3.0/') a = get_content_type('https://tefkos.comminfo.rutgers.edu/Courses/e530/Readings/Beal%202008%20full%20text%20searching.pdf')
print(a) print(a)
print('done') print('done')
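The four `if settings.NUM_...:` blocks in the new `list()` share one pattern: skip a source whose limit is zero, otherwise take the first N refs and tag them with the source name. The sketch below expresses that pattern in table-driven form. `build_feed` and its arguments are hypothetical names for illustration, not code from the repo.

```python
def build_feed(sources, limits):
    """Combine refs from several sources.

    sources: dict of source name -> zero-argument callable returning a list of refs
    limits:  dict of source name -> maximum number of refs to take (0 disables)
    """
    feed = []
    for name, fetch in sources.items():
        limit = limits.get(name, 0)
        if not limit:
            continue  # disabled source: its fetcher is never called
        feed += [(ref, name) for ref in fetch()[:limit]]
    return feed

sources = {
    'hackernews': lambda: ['1', '2', '3'],
    'reddit': lambda: ['x', 'y'],
}
feed = build_feed(sources, {'hackernews': 2, 'reddit': 0})
```

As in the diff, a disabled source costs no network request because its `feed()` is not invoked at all.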
+96 -10
@@ -12,7 +12,8 @@ import requests
from utils import clean from utils import clean
API_TOPSTORIES = lambda x: 'https://hacker-news.firebaseio.com/v0/topstories.json' API_TOPSTORIES = lambda x: 'https://hacker-news.firebaseio.com/v0/topstories.json'
API_ITEM = lambda x : 'https://hn.algolia.com/api/v1/items/{}'.format(x) ALG_API_ITEM = lambda x : 'https://hn.algolia.com/api/v1/items/{}'.format(x)
BHN_API_ITEM = lambda x : 'https://api.hnpwa.com/v0/item/{}.json'.format(x)
SITE_LINK = lambda x : 'https://news.ycombinator.com/item?id={}'.format(x) SITE_LINK = lambda x : 'https://news.ycombinator.com/item?id={}'.format(x)
SITE_AUTHOR_LINK = lambda x : 'https://news.ycombinator.com/user?id={}'.format(x) SITE_AUTHOR_LINK = lambda x : 'https://news.ycombinator.com/user?id={}'.format(x)
@@ -25,6 +26,16 @@ def api(route, ref=None):
return r.json() return r.json()
except KeyboardInterrupt: except KeyboardInterrupt:
raise raise
except BaseException as e:
logging.error('Problem hitting hackernews API: {}, trying again'.format(str(e)))
try:
r = requests.get(route(ref), timeout=15)
if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
except KeyboardInterrupt:
raise
except BaseException as e: except BaseException as e:
logging.error('Problem hitting hackernews API: {}'.format(str(e))) logging.error('Problem hitting hackernews API: {}'.format(str(e)))
return False return False
@@ -32,7 +43,7 @@ def api(route, ref=None):
def feed(): def feed():
return [str(x) for x in api(API_TOPSTORIES) or []] return [str(x) for x in api(API_TOPSTORIES) or []]
def comment(i): def alg_comment(i):
if 'author' not in i: if 'author' not in i:
return False return False
@@ -41,21 +52,25 @@ def comment(i):
c['score'] = i.get('points', 0) c['score'] = i.get('points', 0)
c['date'] = i.get('created_at_i', 0) c['date'] = i.get('created_at_i', 0)
c['text'] = clean(i.get('text', '') or '') c['text'] = clean(i.get('text', '') or '')
c['comments'] = [comment(j) for j in i['children']] c['comments'] = [alg_comment(j) for j in i['children']]
c['comments'] = list(filter(bool, c['comments'])) c['comments'] = list(filter(bool, c['comments']))
return c return c
def comment_count(i): def alg_comment_count(i):
alive = 1 if i['author'] else 0 alive = 1 if i['author'] else 0
return sum([comment_count(c) for c in i['comments']]) + alive return sum([alg_comment_count(c) for c in i['comments']]) + alive
def story(ref): def alg_story(ref):
r = api(API_ITEM, ref) r = api(ALG_API_ITEM, ref)
if not r: return False if not r:
logging.info('Bad Algolia Hackernews API response.')
return None
if 'deleted' in r: if 'deleted' in r:
logging.info('Story was deleted.')
return False return False
elif r.get('type', '') != 'story': elif r.get('type', '') != 'story':
logging.info('Type "{}" is not "story".'.format(r.get('type', '')))
return False return False
s = {} s = {}
@@ -66,17 +81,88 @@ def story(ref):
s['title'] = r.get('title', '') s['title'] = r.get('title', '')
s['link'] = SITE_LINK(ref) s['link'] = SITE_LINK(ref)
s['url'] = r.get('url', '') s['url'] = r.get('url', '')
s['comments'] = [comment(i) for i in r['children']] s['comments'] = [alg_comment(i) for i in r['children']]
s['comments'] = list(filter(bool, s['comments'])) s['comments'] = list(filter(bool, s['comments']))
s['num_comments'] = comment_count(s) - 1 s['num_comments'] = alg_comment_count(s) - 1
if 'text' in r and r['text']: if 'text' in r and r['text']:
s['text'] = clean(r['text'] or '') s['text'] = clean(r['text'] or '')
return s return s
def bhn_comment(i):
if 'user' not in i:
return False
c = {}
c['author'] = i.get('user', '')
c['score'] = 0 # Not present?
c['date'] = i.get('time', 0)
c['text'] = clean(i.get('content', '') or '')
c['comments'] = [bhn_comment(j) for j in i['comments']]
c['comments'] = list(filter(bool, c['comments']))
return c
def bhn_story(ref):
r = api(BHN_API_ITEM, ref)
if not r:
logging.info('Bad BetterHN Hackernews API response.')
return None
if 'deleted' in r: # TODO: verify
logging.info('Story was deleted.')
return False
elif r.get('dead', False):
logging.info('Story was deleted.')
return False
elif r.get('type', '') != 'link':
logging.info('Type "{}" is not "link".'.format(r.get('type', '')))
return False
s = {}
s['author'] = r.get('user', '')
s['author_link'] = SITE_AUTHOR_LINK(r.get('user', ''))
s['score'] = r.get('points', 0)
s['date'] = r.get('time', 0)
s['title'] = r.get('title', '')
s['link'] = SITE_LINK(ref)
s['url'] = r.get('url', '')
if s['url'].startswith('item'):
s['url'] = SITE_LINK(ref)
s['comments'] = [bhn_comment(i) for i in r['comments']]
s['comments'] = list(filter(bool, s['comments']))
s['num_comments'] = r.get('comments_count', 0)
if 'content' in r and r['content']:
s['text'] = clean(r['content'] or '')
return s
def story(ref):
s = alg_story(ref)
if s is None:
s = bhn_story(ref)
if not s:
return False
if not s['title']:
return False
if s['score'] < 25 and s['num_comments'] < 10:
logging.info('Score ({}) or num comments ({}) below threshold.'.format(s['score'], s['num_comments']))
return False
return s
# scratchpad so I can quickly develop the parser # scratchpad so I can quickly develop the parser
if __name__ == '__main__': if __name__ == '__main__':
print(feed()) print(feed())
#print(story(20763961)) #print(story(20763961))
#print(story(20802050)) #print(story(20802050))
#print(story(42899834)) # type "job"
#print(story(42900076)) # Ask HN
#print(story(42898201)) # Show HN
#print(story(42899703)) # normal
print(story(42902678)) # bad title?
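The new `story()` relies on a three-way return convention: `None` means the Algolia API call itself failed (so the second API is tried), `False` means the story was fetched but rejected, and a dict means success. A sketch of that control flow with the two fetchers injected as plain callables is shown below. `fetch_story` and its parameters are hypothetical names, and the default thresholds are the ones in the diff.

```python
def fetch_story(ref, primary, fallback, min_score=25, min_comments=10):
    """Return a story dict, or False if it is unavailable or filtered out."""
    s = primary(ref)
    if s is None:
        # None signals an API failure, not a rejection, so try the other API.
        s = fallback(ref)
    if not s:
        # False: rejected (deleted, wrong type). None: both APIs failed.
        return False
    if not s['title']:
        return False
    # A story is dropped only when BOTH numbers are below their thresholds.
    if s['score'] < min_score and s['num_comments'] < min_comments:
        return False
    return s

good = {'title': 'Example', 'score': 30, 'num_comments': 0}
from_fallback = fetch_story('1', lambda ref: None, lambda ref: good)
rejected = fetch_story('1', lambda ref: False, lambda ref: good)
```

Note that the log message in the diff says "Score or num comments below threshold", while the condition uses `and`: a story with a low score but many comments still passes.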
+120
@@ -0,0 +1,120 @@
import logging
logging.basicConfig(
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
level=logging.DEBUG)
if __name__ == '__main__':
import sys
sys.path.insert(0,'.')
import requests
from datetime import datetime
from utils import clean
API_HOTTEST = lambda x: 'https://lobste.rs/hottest.json'
API_ITEM = lambda x : 'https://lobste.rs/s/{}.json'.format(x)
SITE_LINK = lambda x : 'https://lobste.rs/s/{}'.format(x)
SITE_AUTHOR_LINK = lambda x : 'https://lobste.rs/u/{}'.format(x)
def api(route, ref=None):
try:
r = requests.get(route(ref), timeout=5)
if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
except KeyboardInterrupt:
raise
except BaseException as e:
logging.error('Problem hitting lobsters API: {}, trying again'.format(str(e)))
try:
r = requests.get(route(ref), timeout=15)
if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
except KeyboardInterrupt:
raise
except BaseException as e:
logging.error('Problem hitting lobsters API: {}'.format(str(e)))
return False
def feed():
return [x['short_id'] for x in api(API_HOTTEST) or []]
def unix(date_str):
date_str = date_str.replace(':', '')
return int(datetime.strptime(date_str, '%Y-%m-%dT%H%M%S.%f%z').timestamp())
def make_comment(i):
c = {}
try:
c['author'] = i['commenting_user']
except KeyError:
c['author'] = ''
c['score'] = i.get('score', 0)
try:
c['date'] = unix(i['created_at'])
except KeyError:
c['date'] = 0
c['text'] = clean(i.get('comment', '') or '')
c['comments'] = []
return c
def iter_comments(flat_comments):
nested_comments = []
parent_stack = []
for comment in flat_comments:
c = make_comment(comment)
indent = comment['depth']
if indent == 0:
nested_comments.append(c)
parent_stack = [c]
else:
parent_stack = parent_stack[:indent]
p = parent_stack[-1]
p['comments'].append(c)
parent_stack.append(c)
return nested_comments
def story(ref):
r = api(API_ITEM, ref)
if not r:
logging.info('Bad Lobsters API response.')
return False
s = {}
try:
s['author'] = r['submitter_user']
s['author_link'] = SITE_AUTHOR_LINK(s['author'])
except KeyError:
s['author'] = ''
s['author_link'] = ''
s['score'] = r.get('score', 0)
try:
s['date'] = unix(r['created_at'])
except KeyError:
s['date'] = 0
s['title'] = r.get('title', '')
s['link'] = SITE_LINK(ref)
s['url'] = r.get('url', '')
s['comments'] = iter_comments(r['comments'])
s['num_comments'] = r['comment_count']
if s['score'] < 15 and s['num_comments'] < 10:
logging.info('Score ({}) or num comments ({}) below threshold.'.format(s['score'], s['num_comments']))
return False
if 'description' in r and r['description']:
s['text'] = clean(r['description'] or '')
return s
# scratchpad so I can quickly develop the parser
if __name__ == '__main__':
#print(feed())
import json
print(json.dumps(story('fzvd1v'), indent=4))
#print(json.dumps(story('ixyv5u'), indent=4))
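`iter_comments` rebuilds a comment tree from a flat, thread-ordered list in which each entry carries a `depth` value. The sketch below isolates that technique on toy data. The `depth` key comes from the diff; the `text` key and the `nest_comments` name are simplifications for illustration, and the input is assumed to start with a depth-0 comment.

```python
def nest_comments(flat_comments):
    """Turn a thread-ordered flat list with 'depth' values into a nested tree."""
    nested = []
    stack = []  # stack[d] holds the most recent comment seen at depth d
    for item in flat_comments:
        node = {'text': item['text'], 'comments': []}
        depth = item['depth']
        stack = stack[:depth]  # discard entries at this depth or deeper
        if depth == 0:
            nested.append(node)
        else:
            stack[-1]['comments'].append(node)  # attach to the nearest ancestor
        stack.append(node)
    return nested

flat = [
    {'depth': 0, 'text': 'a'},
    {'depth': 1, 'text': 'a1'},
    {'depth': 2, 'text': 'a1x'},
    {'depth': 1, 'text': 'a2'},
    {'depth': 0, 'text': 'b'},
]
tree = nest_comments(flat)
```

Truncating the stack to the current depth before attaching is what lets a reply at depth 1 find its depth-0 parent again after a deeper sub-thread has ended.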
+9 -4
@@ -7,12 +7,15 @@ import requests
import time import time
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
USER_AGENT = 'Twitterbot/1.0' USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0'
def api(route): def api(route):
try: try:
headers = {'User-Agent': USER_AGENT} headers = {
r = requests.get(route, headers=headers, timeout=5) 'User-Agent': USER_AGENT,
'X-Forwarded-For': '66.249.66.1',
}
r = requests.get(route, headers=headers, timeout=10)
if r.status_code != 200: if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code)) raise Exception('Bad response code ' + str(r.status_code))
return r.text return r.text
@@ -24,7 +27,9 @@ def api(route):
def story(ref): def story(ref):
html = api(ref) html = api(ref)
if not html: return False if not html:
logging.info('Bad http GET response.')
return False
soup = BeautifulSoup(html, features='html.parser') soup = BeautifulSoup(html, features='html.parser')
+18 -12
@@ -12,25 +12,28 @@ from praw.exceptions import PRAWException
from praw.models import MoreComments from praw.models import MoreComments
from prawcore.exceptions import PrawcoreException from prawcore.exceptions import PrawcoreException
import settings
from utils import render_md, clean from utils import render_md, clean
SUBREDDITS = 'Economics+AcademicPhilosophy+DepthHub+Foodforthought+HistoryofIdeas+LaymanJournals+PhilosophyofScience+PoliticsPDFs+Scholar+StateOfTheUnion+TheAgora+TrueFilm+TrueReddit+UniversityofReddit+culturalstudies+hardscience+indepthsports+indepthstories+ludology+neurophilosophy+resilientcommunities+worldevents'
SITE_LINK = lambda x : 'https://old.reddit.com{}'.format(x) SITE_LINK = lambda x : 'https://old.reddit.com{}'.format(x)
SITE_AUTHOR_LINK = lambda x : 'https://old.reddit.com/u/{}'.format(x) SITE_AUTHOR_LINK = lambda x : 'https://old.reddit.com/u/{}'.format(x)
reddit = praw.Reddit('bot') if settings.NUM_REDDIT:
reddit = praw.Reddit(
client_id=settings.REDDIT_CLIENT_ID,
client_secret=settings.REDDIT_CLIENT_SECRET,
user_agent=settings.REDDIT_USER_AGENT,
)
subs = '+'.join(settings.SUBREDDITS)
def feed(): def feed():
try: try:
return [x.id for x in reddit.subreddit(SUBREDDITS).hot()] return [x.id for x in reddit.subreddit(subs).hot()]
except KeyboardInterrupt: except KeyboardInterrupt:
raise raise
except PRAWException as e: except BaseException as e:
logging.error('Problem hitting reddit API: {}'.format(str(e))) logging.critical('Problem hitting reddit API: {}'.format(str(e)))
return []
except PrawcoreException as e:
logging.error('Problem hitting reddit API: {}'.format(str(e)))
return [] return []
def comment(i): def comment(i):
@@ -53,7 +56,9 @@ def comment(i):
def story(ref): def story(ref):
try: try:
r = reddit.submission(ref) r = reddit.submission(ref)
if not r: return False if not r:
logging.info('Bad Reddit API response.')
return False
s = {} s = {}
s['author'] = r.author.name if r.author else '[Deleted]' s['author'] = r.author.name if r.author else '[Deleted]'
@@ -68,6 +73,7 @@ def story(ref):
s['num_comments'] = r.num_comments s['num_comments'] = r.num_comments
if s['score'] < 25 and s['num_comments'] < 10: if s['score'] < 25 and s['num_comments'] < 10:
logging.info('Score ({}) or num comments ({}) below threshold.'.format(s['score'], s['num_comments']))
return False return False
if r.selftext: if r.selftext:
@@ -78,10 +84,10 @@ def story(ref):
except KeyboardInterrupt: except KeyboardInterrupt:
raise raise
except PRAWException as e: except PRAWException as e:
logging.error('Problem hitting reddit API: {}'.format(str(e))) logging.critical('Problem hitting reddit API: {}'.format(str(e)))
return False return False
except PrawcoreException as e: except PrawcoreException as e:
logging.error('Problem hitting reddit API: {}'.format(str(e))) logging.critical('Problem hitting reddit API: {}'.format(str(e)))
return False return False
# scratchpad so I can quickly develop the parser # scratchpad so I can quickly develop the parser
+27 -8
@@ -16,7 +16,7 @@ from utils import clean
# cache the topic groups to prevent redirects # cache the topic groups to prevent redirects
group_lookup = {} group_lookup = {}
USER_AGENT = 'qotnews scraper (github:tannercollin)' USER_AGENT = 'qotnews scraper (github:tanner37)'
API_TOPSTORIES = lambda : 'https://tildes.net' API_TOPSTORIES = lambda : 'https://tildes.net'
API_ITEM = lambda x : 'https://tildes.net/shortener/{}'.format(x) API_ITEM = lambda x : 'https://tildes.net/shortener/{}'.format(x)
@@ -34,7 +34,7 @@ def api(route):
except KeyboardInterrupt: except KeyboardInterrupt:
raise raise
except BaseException as e: except BaseException as e:
logging.error('Problem hitting tildes website: {}'.format(str(e))) logging.critical('Problem hitting tildes website: {}'.format(str(e)))
return False return False
def feed(): def feed():
@@ -71,11 +71,15 @@ def story(ref):
html = api(SITE_LINK(group_lookup[ref], ref)) html = api(SITE_LINK(group_lookup[ref], ref))
else: else:
html = api(API_ITEM(ref)) html = api(API_ITEM(ref))
if not html: return False if not html:
logging.info('Bad Tildes API response.')
return False
soup = BeautifulSoup(html, features='html.parser') soup = BeautifulSoup(html, features='html.parser')
a = soup.find('article', class_='topic-full') a = soup.find('article', class_='topic-full')
if a is None: return False if a is None:
logging.info('Tildes <article> element not found.')
return False
h = a.find('header') h = a.find('header')
lu = h.find('a', class_='link-user') lu = h.find('a', class_='link-user')
@@ -83,6 +87,7 @@ def story(ref):
error = a.find('div', class_='text-error') error = a.find('div', class_='text-error')
if error: if error:
if 'deleted' in error.string or 'removed' in error.string: if 'deleted' in error.string or 'removed' in error.string:
logging.info('Article was deleted or removed.')
return False return False
s = {} s = {}
@@ -102,7 +107,21 @@ def story(ref):
ch = a.find('header', class_='topic-comments-header') ch = a.find('header', class_='topic-comments-header')
s['num_comments'] = int(ch.h2.string.split(' ')[0]) if ch else 0 s['num_comments'] = int(ch.h2.string.split(' ')[0]) if ch else 0
if s['score'] < 8 and s['num_comments'] < 6: if s['group'].split('.')[0] not in [
'~arts',
'~comp',
'~creative',
'~design',
'~engineering',
'~finance',
'~science',
'~tech',
]:
logging.info('Group ({}) not in whitelist.'.format(s['group']))
return False
if s['score'] < 15 and s['num_comments'] < 10:
logging.info('Score ({}) or num comments ({}) below threshold.'.format(s['score'], s['num_comments']))
return False return False
td = a.find('div', class_='topic-full-text') td = a.find('div', class_='topic-full-text')
@@ -113,7 +132,7 @@ def story(ref):
# scratchpad so I can quickly develop the parser # scratchpad so I can quickly develop the parser
if __name__ == '__main__': if __name__ == '__main__':
#print(feed()) print(feed())
#normal = story('gxt') #normal = story('gxt')
#print(normal) #print(normal)
#no_comments = story('gxr') #no_comments = story('gxr')
@@ -122,8 +141,8 @@ if __name__ == '__main__':
#print(self_post) #print(self_post)
#li_comment = story('gqx') #li_comment = story('gqx')
#print(li_comment) #print(li_comment)
broken = story('q4y') #broken = story('q4y')
print(broken) #print(broken)
# make sure there's no self-reference # make sure there's no self-reference
#import copy #import copy
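The Tildes change filters stories by their top-level group: `s['group'].split('.')[0]` reduces a sub-group to its parent before the whitelist lookup. A small sketch of that check follows. The group names are the ones listed in the diff, while `group_allowed` and the sample inputs are illustrative assumptions.

```python
ALLOWED_GROUPS = {
    '~arts', '~comp', '~creative', '~design',
    '~engineering', '~finance', '~science', '~tech',
}

def group_allowed(group):
    """True if the group, or the parent of a dotted sub-group, is whitelisted."""
    top_level = group.split('.')[0]
    return top_level in ALLOWED_GROUPS

sub_group_ok = group_allowed('~comp.programming')  # parent '~comp' is listed
other_group_ok = group_allowed('~talk')            # not in the whitelist
```

Using a set instead of the inline list gives the same result and makes the membership test independent of list length.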
-26
@@ -1,26 +0,0 @@
import shelve
import archive
archive.init()
#with shelve.open('data/data') as db:
# to_delete = []
#
# for s in db.values():
# if 'title' in s:
# archive.update(s)
# if 'id' in s:
# to_delete.append(s['id'])
#
# for id in to_delete:
# del db[id]
#
# for s in db['news_cache'].values():
# if 'title' in s:
# archive.update(s)
#with shelve.open('data/whoosh') as db:
# for s in db['news_cache'].values():
# if 'title' in s and not archive.get_story(s['id']):
# archive.update(s)
-74
@@ -1,74 +0,0 @@
import archive
import database
import search
import json
import requests
database.init()
archive.init()
search.init()
count = 0
def database_del_story_by_ref(ref):
try:
session = database.Session()
session.query(database.Story).filter(database.Story.ref==ref).delete()
session.commit()
except:
session.rollback()
raise
finally:
session.close()
def search_del_story(sid):
try:
r = requests.delete(search.MEILI_URL + 'indexes/qotnews/documents/'+sid, timeout=2)
if r.status_code != 202:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
except KeyboardInterrupt:
raise
except BaseException as e:
logging.error('Problem deleting MeiliSearch story: {}'.format(str(e)))
return False
with archive.ix.searcher() as searcher:
print('count all', searcher.doc_count_all())
print('count', searcher.doc_count())
for doc in searcher.documents():
try:
print('num', count, 'id', doc['id'])
count += 1
story = doc['story']
story.pop('img', None)
if 'reddit.com/r/technology' in story['link']:
print('skipping r/technology')
continue
try:
database.put_story(story)
except database.IntegrityError:
print('collision!')
old_story = database.get_story_by_ref(story['ref'])
old_story = json.loads(old_story.full_json)
if story['num_comments'] > old_story['num_comments']:
print('more comments, replacing')
database_del_story_by_ref(story['ref'])
database.put_story(story)
search_del_story(old_story['id'])
else:
print('fewer comments, skipping')
continue
search.put_story(story)
print()
except KeyboardInterrupt:
break
except BaseException as e:
print('skipping', doc['id'])
print('reason:', e)
-4
@@ -1,4 +0,0 @@
[bot]
client_id=
client_secret=
user_agent=
+67
@@ -0,0 +1,67 @@
import logging
logging.basicConfig(
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
level=logging.INFO)
import database
from sqlalchemy import select
import search
import sys
import time
import json
import requests
from bs4 import BeautifulSoup
database.init()
search.init()
BATCH_SIZE = 1000
def put_stories(stories):
return search.meili_api(requests.post, 'indexes/qotnews/documents', stories)
def get_update(update_id):
return search.meili_api(requests.get, 'tasks/{}'.format(update_id))
if __name__ == '__main__':
num_stories = database.count_stories()
print('Reindex {} stories?'.format(num_stories))
print('Press ENTER to continue, ctrl-c to cancel')
input()
story_list = database.get_story_list()
count = 1
while len(story_list):
stories = []
for _ in range(BATCH_SIZE):
try:
sid = story_list.pop()
except IndexError:
break
story = database.get_story(sid)
print('Indexing {}/{} id: {} title: {}'.format(count, num_stories, sid[0], story.title))
story_obj = json.loads(story.full_json)
story_obj.pop('comments', False)
if 'text' in story_obj and story_obj['text']:
soup = BeautifulSoup(story_obj['text'], 'html.parser')
story_obj['text'] = soup.get_text()
stories.append(story_obj)
count += 1
res = put_stories(stories)
update_id = res['taskUid']
print('Waiting for processing', end='')
while get_update(update_id)['status'] != 'succeeded':
time.sleep(0.5)
print('.', end='', flush=True)
print()
print('Done.')
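The polling loop in this script exits only when the task status becomes `'succeeded'`, so a task that ends as `'failed'` would keep it spinning. The sketch below shows one way to poll with explicit terminal states and a timeout. `wait_for_task` is a hypothetical helper, and the status callable is injected so the example runs without a MeiliSearch server; in the script it might wrap `get_update(update_id)['status']`.

```python
import time

def wait_for_task(get_status, poll_interval=0.5, timeout=60.0):
    """Poll get_status() until the task finishes.

    Returns True on 'succeeded', False on 'failed' or 'canceled'.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == 'succeeded':
            return True
        if status in ('failed', 'canceled'):
            return False
        if time.monotonic() >= deadline:
            raise TimeoutError('task still {!r} after {}s'.format(status, timeout))
        time.sleep(poll_interval)

# Simulated status sequence standing in for repeated API calls.
statuses = iter(['enqueued', 'processing', 'succeeded'])
finished_ok = wait_for_task(lambda: next(statuses), poll_interval=0)
```

Returning `False` on failure lets the caller decide whether to retry the batch or abort the reindex.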
+1 -1
@@ -8,6 +8,7 @@ Flask==1.1.2
Flask-Cors==3.0.8 Flask-Cors==3.0.8
gevent==20.6.2 gevent==20.6.2
greenlet==0.4.16 greenlet==0.4.16
humanize==4.10.0
idna==2.10 idna==2.10
itsdangerous==1.1.0 itsdangerous==1.1.0
Jinja2==2.11.2 Jinja2==2.11.2
@@ -25,6 +26,5 @@ urllib3==1.25.9
webencodings==0.5.1 webencodings==0.5.1
websocket-client==0.57.0 websocket-client==0.57.0
Werkzeug==1.0.1 Werkzeug==1.0.1
Whoosh==2.7.4
zope.event==4.4 zope.event==4.4
zope.interface==5.1.0 zope.interface==5.1.0
@@ -1,6 +1,8 @@
import database import database
import search import search
import sys import sys
import settings
import logging
import json import json
import requests import requests
@@ -21,7 +23,7 @@ def database_del_story(sid):
def search_del_story(sid): def search_del_story(sid):
try: try:
r = requests.delete(search.MEILI_URL + 'indexes/qotnews/documents/'+sid, timeout=2) r = requests.delete(settings.MEILI_URL + 'indexes/qotnews/documents/'+sid, timeout=2)
if r.status_code != 202: if r.status_code != 202:
raise Exception('Bad response code ' + str(r.status_code)) raise Exception('Bad response code ' + str(r.status_code))
return r.json() return r.json()
+58
@@ -0,0 +1,58 @@
import time
import json
import logging
import feed
import database
import search
database.init()
def fix_gzip_bug(story_list):
FIX_THRESHOLD = 150
count = 1
for sid in story_list:
try:
sid = sid[0]
story = database.get_story(sid)
full_json = json.loads(story.full_json)
meta_json = json.loads(story.meta_json)
text = full_json.get('text', '')
count = text.count('\ufffd') # U+FFFD replacement character, printed as "?" below
if not count: continue
ratio = count / len(text) * 1000
print('Bad story:', sid, 'Num ?:', count, 'Ratio:', ratio)
if ratio < FIX_THRESHOLD: continue
print('Attempting to fix...')
valid = feed.update_story(meta_json, is_manual=True)
if valid:
database.put_story(meta_json)
search.put_story(meta_json)
print('Success')
else:
print('Story was not valid')
time.sleep(3)
except KeyboardInterrupt:
raise
except BaseException as e:
logging.exception(e)
breakpoint()
if __name__ == '__main__':
num_stories = database.count_stories()
print('Fix {} stories?'.format(num_stories))
print('Press ENTER to continue, ctrl-c to cancel')
input()
story_list = database.get_story_list()
fix_gzip_bug(story_list)
+66
@@ -0,0 +1,66 @@
import logging
logging.basicConfig(
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    level=logging.INFO)

import database
from sqlalchemy import select
import search
import sys
import time
import json
import requests
from bs4 import BeautifulSoup

database.init()
search.init()

BATCH_SIZE = 5000

def put_stories(stories):
    return search.meili_api(requests.post, 'indexes/qotnews/documents', stories)

def get_update(update_id):
    return search.meili_api(requests.get, 'tasks/{}'.format(update_id))

if __name__ == '__main__':
    num_stories = database.count_stories()

    print('Reindex {} stories?'.format(num_stories))
    print('Press ENTER to continue, ctrl-c to cancel')
    input()

    story_list = database.get_story_list()

    count = 1
    while len(story_list):
        stories = []

        for _ in range(BATCH_SIZE):
            try:
                sid = story_list.pop()
            except IndexError:
                break

            story = database.get_story(sid)
            print('Indexing {}/{} id: {} title: {}'.format(count, num_stories, sid[0], story.title))
            story_obj = json.loads(story.meta_json)
            if 'text' in story_obj and story_obj['text']:
                soup = BeautifulSoup(story_obj['text'], 'html.parser')
                story_obj['text'] = soup.get_text()
            stories.append(story_obj)
            count += 1

        res = put_stories(stories)
        update_id = res['uid']

        print('Waiting for processing', end='')
        while get_update(update_id)['status'] != 'succeeded':
            time.sleep(0.5)
            print('.', end='', flush=True)
        print()

    print('Done.')
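The reindex loop above does two things per batch: it strips HTML from each story's `text` and groups documents into fixed-size batches. A minimal standard-library sketch of both steps follows. It swaps BeautifulSoup for `html.parser.HTMLParser` so it runs without third-party packages, and `html_to_text` and `batches` are hypothetical helper names, not part of the repository.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards tags (stand-in for BeautifulSoup.get_text)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the concatenated text content of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return ''.join(parser.parts)


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == '__main__':
    docs = [{'id': i, 'text': '<p>Story <b>{}</b></p>'.format(i)} for i in range(5)]
    for batch in batches(docs, 2):
        prepared = [dict(d, text=html_to_text(d['text'])) for d in batch]
        print(prepared)
```

Slicing keeps the input list intact, whereas the script consumes `story_list` with `pop()`. Either works for a one-off reindex.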
+23
@@ -0,0 +1,23 @@
import time
import requests

def test_search_api():
    num_tests = 100
    total_time = 0

    for i in range(num_tests):
        start = time.time()

        res = requests.get('http://127.0.0.1:33842/api/search?q=iphone')
        res.raise_for_status()

        duration = time.time() - start
        total_time += duration

    avg_time = total_time / num_tests

    print('Average search time:', avg_time)

if __name__ == '__main__':
    test_search_api()
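The timing script above reports only the average. A hedged sketch of a reusable variant that also reports the median and worst case is shown below. `benchmark` is a hypothetical helper, not part of the repository. It takes any callable, so the HTTP request is replaced here by a stand-in workload, and it uses `time.perf_counter()`, which is better suited to measuring short intervals than `time.time()`.

```python
import statistics
import time


def benchmark(fn, num_runs=100):
    """Call fn() num_runs times; return (mean, median, max) durations in seconds."""
    durations = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.median(durations), max(durations)


if __name__ == '__main__':
    # Stand-in workload; in the script above this would be the search request.
    mean, median, worst = benchmark(lambda: sum(range(10000)), num_runs=50)
    print('mean: {:.6f}s median: {:.6f}s max: {:.6f}s'.format(mean, median, worst))
```

The median is less sensitive than the mean to a single slow request, such as the first one after the server starts.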
+40 -65
@@ -4,86 +4,61 @@ logging.basicConfig(
level=logging.DEBUG) level=logging.DEBUG)
import requests import requests
import settings
MEILI_URL = 'http://127.0.0.1:7700/' SEARCH_ENABLED = bool(settings.MEILI_URL)
def create_index(): def meili_api(method, route, json=None, params=None, parse_json=True):
try: try:
json = dict(name='qotnews', uid='qotnews') headers = {'Authorization': 'Bearer ' + settings.MEILI_API_KEY}
r = requests.post(MEILI_URL + 'indexes', json=json, timeout=2) r = method(settings.MEILI_URL + route, json=json, params=params, headers=headers, timeout=4)
if r.status_code != 201: if r.status_code > 299:
raise Exception('Bad response code ' + str(r.status_code)) raise Exception('Bad response code ' + str(r.status_code))
if parse_json:
return r.json() return r.json()
else:
r.encoding = 'utf-8'
return r.text
except KeyboardInterrupt: except KeyboardInterrupt:
raise raise
except BaseException as e: except BaseException as e:
logging.error('Problem creating MeiliSearch index: {}'.format(str(e))) logging.error('Problem with MeiliSearch api route: %s: %s', route, str(e))
return False return False
def update_rankings(): def update_settings():
try: json = {
json = ['typo', 'words', 'proximity', 'attribute', 'desc(date)', 'wordsPosition', 'exactness'] 'rankingRules': ['words', 'typo', 'proximity', 'attribute', 'date:desc', 'exactness'],
r = requests.post(MEILI_URL + 'indexes/qotnews/settings/ranking-rules', json=json, timeout=2) 'searchableAttributes': ['title', 'url', 'author', 'text'],
if r.status_code != 202: 'displayedAttributes': ['id', 'ref', 'source', 'author', 'author_link', 'score', 'date', 'title', 'link', 'url', 'num_comments', 'text'],
raise Exception('Bad response code ' + str(r.status_code)) 'stopWords': ['a', 'an', 'the', 'and', 'or', 'but', 'if', 'in', 'on', 'at', 'by', 'for', 'with', 'to', 'from', 'of', 'is', 'it', 'that', 'this'],
return r.json() }
except KeyboardInterrupt: return meili_api(requests.patch, 'indexes/qotnews/settings', json=json)
raise
except BaseException as e:
logging.error('Problem setting MeiliSearch ranking rules: {}'.format(str(e)))
return False
def update_attributes():
try:
json = ['title', 'url', 'author', 'link', 'id']
r = requests.post(MEILI_URL + 'indexes/qotnews/settings/searchable-attributes', json=json, timeout=2)
if r.status_code != 202:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
r = requests.delete(MEILI_URL + 'indexes/qotnews/settings/displayed-attributes', timeout=2)
if r.status_code != 202:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
except KeyboardInterrupt:
raise
except BaseException as e:
logging.error('Problem setting MeiliSearch searchable attributes: {}'.format(str(e)))
return False
def init(): def init():
create_index() if not SEARCH_ENABLED:
update_rankings() logging.info('Search is not enabled, skipping init.')
update_attributes() return
update_settings()
def put_story(story): def put_story(story):
story = story.copy() if not SEARCH_ENABLED: return
story.pop('text', None) return meili_api(requests.post, 'indexes/qotnews/documents', [story])
story.pop('comments', None)
try:
r = requests.post(MEILI_URL + 'indexes/qotnews/documents', json=[story], timeout=2)
if r.status_code != 202:
raise Exception('Bad response code ' + str(r.status_code))
return r.json()
except KeyboardInterrupt:
raise
except BaseException as e:
logging.error('Problem putting MeiliSearch story: {}'.format(str(e)))
return False
def search(q): def search(q, in_article=False):
try: if not SEARCH_ENABLED: return []
params = dict(q=q, limit=250)
r = requests.get(MEILI_URL + 'indexes/qotnews/search', params=params, timeout=2) json = dict(q=q, limit=settings.FEED_LENGTH)
if r.status_code != 200:
raise Exception('Bad response code ' + str(r.status_code)) if True:
return r.json()['hits'] json['attributesToSearchOn'] = ['text']
except KeyboardInterrupt: json['attributesToCrop'] = ['text']
raise json['attributesToRetrieve'] = ['id', 'ref', 'source', 'author', 'author_link', 'score', 'date', 'title', 'link', 'url', 'num_comments']
except BaseException as e: json['cropLength'] = 80
logging.error('Problem searching MeiliSearch: {}'.format(str(e)))
return False r = meili_api(requests.post, 'indexes/qotnews/search', json=json, parse_json=False)
return r
if __name__ == '__main__': if __name__ == '__main__':
create_index() init()
print(search('the')) print(search('facebook'))
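In the new `search()` above, the `in_article` argument is accepted, but the `if True:` branch applies the article-only options unconditionally. The following is a minimal sketch of how the request body could branch on the flag. It assumes the unchecked case should search only `title`, `url`, and `author`; `build_search_payload` is a hypothetical helper, not part of the repository, and the default `limit` of 75 mirrors `FEED_LENGTH` from the example settings.

```python
RETRIEVE = ['id', 'ref', 'source', 'author', 'author_link', 'score',
            'date', 'title', 'link', 'url', 'num_comments']


def build_search_payload(q, in_article=False, limit=75):
    """Build a Meilisearch search request body for the qotnews index."""
    payload = dict(q=q, limit=limit, attributesToRetrieve=RETRIEVE)
    if in_article:
        # Search the article body and return a cropped excerpt around the match.
        payload['attributesToSearchOn'] = ['text']
        payload['attributesToCrop'] = ['text']
        payload['cropLength'] = 80
    else:
        # Assumption: metadata-only search when the checkbox is unchecked.
        payload['attributesToSearchOn'] = ['title', 'url', 'author']
    return payload


if __name__ == '__main__':
    print(build_search_payload('facebook'))
    print(build_search_payload('facebook', in_article=True))
```

Keeping the payload construction in a pure function makes the two modes easy to check without a running Meilisearch instance.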
+156 -34
@@ -1,7 +1,8 @@
import logging import os, logging
DEBUG = os.environ.get('DEBUG')
logging.basicConfig( logging.basicConfig(
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
level=logging.INFO) level=logging.DEBUG if DEBUG else logging.INFO)
import gevent import gevent
from gevent import monkey from gevent import monkey
@@ -13,22 +14,46 @@ import json
import threading import threading
import traceback import traceback
import time import time
import datetime
import humanize
import urllib.request
from urllib.parse import urlparse, parse_qs from urllib.parse import urlparse, parse_qs
import settings
import database import database
import search import search
import feed import feed
from utils import gen_rand_id from utils import gen_rand_id, NUM_ID_CHARS
from flask import abort, Flask, request, render_template, stream_with_context, Response from flask import abort, Flask, request, render_template, stream_with_context, Response
from werkzeug.exceptions import NotFound from werkzeug.exceptions import NotFound
from flask_cors import CORS from flask_cors import CORS
smallweb_set = set()
def load_smallweb_list():
EXCLUDED = [
'github.com',
]
global smallweb_set
try:
url = 'https://raw.githubusercontent.com/kagisearch/smallweb/refs/heads/main/smallweb.txt'
with urllib.request.urlopen(url, timeout=10) as response:
urls = response.read().decode('utf-8').splitlines()
hosts = {urlparse(u).hostname for u in urls if u and urlparse(u).hostname}
smallweb_set = {h.replace('www.', '') for h in hosts if h not in EXCLUDED}
logging.info('Loaded {} smallweb domains.'.format(len(smallweb_set)))
except Exception as e:
logging.error('Failed to load smallweb list: {}'.format(e))
load_smallweb_list()
database.init() database.init()
search.init() search.init()
FEED_LENGTH = 75
news_index = 0 news_index = 0
ref_list = []
current_item = {}
def new_id(): def new_id():
nid = gen_rand_id() nid = gen_rand_id()
@@ -36,33 +61,100 @@ def new_id():
nid = gen_rand_id() nid = gen_rand_id()
return nid return nid
build_folder = '../webclient/build'
def fromnow(ts):
return humanize.naturaltime(datetime.datetime.fromtimestamp(ts))
build_folder = './build'
flask_app = Flask(__name__, template_folder=build_folder, static_folder=build_folder, static_url_path='') flask_app = Flask(__name__, template_folder=build_folder, static_folder=build_folder, static_url_path='')
flask_app.jinja_env.filters['fromnow'] = fromnow
cors = CORS(flask_app) cors = CORS(flask_app)
@flask_app.route('/api') @flask_app.route('/api')
def api(): def api():
stories = database.get_stories(FEED_LENGTH) skip = request.args.get('skip', 0)
limit = request.args.get('limit', settings.FEED_LENGTH)
if request.args.get('smallweb') == 'true' and smallweb_set:
limit = int(limit)
skip = int(skip)
filtered_stories = []
current_skip = skip
while len(filtered_stories) < limit:
stories_batch = database.get_stories(limit, current_skip)
if not stories_batch:
break
for story_str in stories_batch:
story = json.loads(story_str)
story_url = story.get('url') or story.get('link') or ''
if not story_url:
continue
hostname = urlparse(story_url).hostname
if hostname:
hostname = hostname.replace('www.', '')
if hostname in smallweb_set:
filtered_stories.append(story_str)
if len(filtered_stories) == limit:
break
if len(filtered_stories) == limit:
break
current_skip += limit
stories = filtered_stories
else:
stories = database.get_stories(limit, skip)
# hacky nested json # hacky nested json
res = Response('{"stories":[' + ','.join(stories) + ']}') res = Response('{"stories":[' + ','.join(stories) + ']}')
res.headers['content-type'] = 'application/json' res.headers['content-type'] = 'application/json'
return res return res
@flask_app.route('/api/stats', strict_slashes=False)
def apistats():
stats = {
'news_index': news_index,
'ref_list': ref_list,
'len_ref_list': len(ref_list),
'current_item': current_item,
'total_stories': database.count_stories(),
'id_space': 26**NUM_ID_CHARS,
}
return stats
@flask_app.route('/api/search', strict_slashes=False) @flask_app.route('/api/search', strict_slashes=False)
def apisearch(): def apisearch():
q = request.args.get('q', '') q = request.args.get('q', '')
in_article = request.args.get('article', False)
if len(q) >= 3: if len(q) >= 3:
results = search.search(q) results = search.search(q, in_article)
else: else:
results = [] results = '[]'
return dict(results=results) res = Response(results)
res.headers['content-type'] = 'application/json'
return res
@flask_app.route('/api/submit', methods=['POST'], strict_slashes=False) @flask_app.route('/api/submit', methods=['POST'], strict_slashes=False)
def submit(): def submit():
try: try:
url = request.form['url'] url = request.form['url']
for prefix in ['http://', 'https://']:
if url.lower().startswith(prefix):
break
else: # for
url = 'http://' + url
nid = new_id() nid = new_id()
logging.info('Manual submission: ' + url)
parse = urlparse(url) parse = urlparse(url)
if 'news.ycombinator.com' in parse.hostname: if 'news.ycombinator.com' in parse.hostname:
source = 'hackernews' source = 'hackernews'
@@ -70,6 +162,9 @@ def submit():
elif 'tildes.net' in parse.hostname and '~' in url: elif 'tildes.net' in parse.hostname and '~' in url:
source = 'tildes' source = 'tildes'
ref = parse.path.split('/')[2] ref = parse.path.split('/')[2]
elif 'lobste.rs' in parse.hostname and '/s/' in url:
source = 'lobsters'
ref = parse.path.split('/')[2]
elif 'reddit.com' in parse.hostname and 'comments' in url: elif 'reddit.com' in parse.hostname and 'comments' in url:
source = 'reddit' source = 'reddit'
ref = parse.path.split('/')[4] ref = parse.path.split('/')[4]
@@ -80,6 +175,11 @@ def submit():
ref = url ref = url
existing = database.get_story_by_ref(ref) existing = database.get_story_by_ref(ref)
if existing and DEBUG:
ref = ref + '#' + str(time.time())
existing = False
if existing: if existing:
return {'nid': existing.sid} return {'nid': existing.sid}
else: else:
@@ -88,14 +188,20 @@ def submit():
if valid: if valid:
database.put_story(story) database.put_story(story)
search.put_story(story) search.put_story(story)
if DEBUG:
logging.info('Adding manual ref: {}, id: {}, source: {}'.format(ref, nid, source))
database.put_ref(ref, nid, source)
return {'nid': nid} return {'nid': nid}
else: else:
raise Exception('Invalid article') raise Exception('Invalid article')
except BaseException as e: except Exception as e:
logging.error('Problem with article submission: {} - {}'.format(e.__class__.__name__, str(e))) msg = 'Problem with article submission: {} - {}'.format(e.__class__.__name__, str(e))
logging.error(msg)
print(traceback.format_exc()) print(traceback.format_exc())
abort(400) return {'error': msg.split('\n')[0]}, 400
@flask_app.route('/api/<sid>') @flask_app.route('/api/<sid>')
@@ -112,10 +218,19 @@ def story(sid):
@flask_app.route('/') @flask_app.route('/')
@flask_app.route('/search') @flask_app.route('/search')
def index(): def index():
stories_json = database.get_stories(settings.FEED_LENGTH, 0)
stories = [json.loads(s) for s in stories_json]
for s in stories:
url = urlparse(s.get('url') or s.get('link') or '').hostname or ''
s['hostname'] = url.replace('www.', '')
return render_template('index.html', return render_template('index.html',
title='Feed', title='QotNews',
url='news.t0.vc', url='news.t0.vc',
description='Reddit, Hacker News, and Tildes combined, then pre-rendered in reader mode') description='Hacker News, Reddit, Lobsters, and Tildes articles rendered in reader mode',
robots='index',
stories=stories,
)
@flask_app.route('/<sid>', strict_slashes=False) @flask_app.route('/<sid>', strict_slashes=False)
@flask_app.route('/<sid>/c', strict_slashes=False) @flask_app.route('/<sid>/c', strict_slashes=False)
@@ -125,9 +240,9 @@ def static_story(sid):
except NotFound: except NotFound:
pass pass
story = database.get_story(sid) story_obj = database.get_story(sid)
if not story: return abort(404) if not story_obj: return abort(404)
story = json.loads(story.full_json) story = json.loads(story_obj.full_json)
score = story['score'] score = story['score']
num_comments = story['num_comments'] num_comments = story['num_comments']
@@ -136,18 +251,22 @@ def static_story(sid):
score, 's' if score != 1 else '', score, 's' if score != 1 else '',
num_comments, 's' if num_comments != 1 else '', num_comments, 's' if num_comments != 1 else '',
source) source)
url = urlparse(story['url']).hostname or urlparse(story['link']).hostname or '' url = urlparse(story.get('url') or story.get('link') or '').hostname or ''
url = url.replace('www.', '') url = url.replace('www.', '')
return render_template('index.html', return render_template('index.html',
title=story['title'], title=story['title'] + ' | QotNews',
url=url, url=url,
description=description) description=description,
robots='noindex',
story=story,
show_comments=request.path.endswith('/c'),
)
http_server = WSGIServer(('', 33842), flask_app) http_server = WSGIServer(('0.0.0.0', 33842), flask_app)
def feed_thread(): def feed_thread():
global news_index global news_index, ref_list, current_item
try: try:
while True: while True:
@@ -158,48 +277,51 @@ def feed_thread():
continue continue
try: try:
nid = new_id() nid = new_id()
logging.info('Adding ref: {}, id: {}, source: {}'.format(ref, nid, source))
database.put_ref(ref, nid, source) database.put_ref(ref, nid, source)
logging.info('Added ref ' + ref)
except database.IntegrityError: except database.IntegrityError:
logging.info('Already have ID / ref, skipping.')
continue continue
ref_list = database.get_reflist(FEED_LENGTH) ref_list = database.get_reflist(settings.FEED_LENGTH)
# update current stories # update current stories
if news_index < len(ref_list): if news_index < len(ref_list):
item = ref_list[news_index] current_item = ref_list[news_index]
try: try:
story_json = database.get_story(item['sid']).full_json story_json = database.get_story(current_item['sid']).full_json
story = json.loads(story_json) story = json.loads(story_json)
except AttributeError: except AttributeError:
story = dict(id=item['sid'], ref=item['ref'], source=item['source']) story = dict(id=current_item['sid'], ref=current_item['ref'], source=current_item['source'])
logging.info('Updating {} story: {}, index: {}'.format(story['source'], story['ref'], news_index))
valid = feed.update_story(story) valid = feed.update_story(story)
if valid: if valid:
database.put_story(story) database.put_story(story)
search.put_story(story) search.put_story(story)
else: else:
database.del_ref(item['ref']) database.del_ref(current_item['ref'])
logging.info('Removed ref {}'.format(item['ref'])) logging.info('Removed ref {}'.format(current_item['ref']))
else: else:
logging.info('Skipping index') logging.info('Skipping index: ' + str(news_index))
gevent.sleep(6) gevent.sleep(6)
news_index += 1 news_index += 1
if news_index == FEED_LENGTH: news_index = 0 if news_index == settings.FEED_LENGTH: news_index = 0
except KeyboardInterrupt: except KeyboardInterrupt:
logging.info('Ending feed thread...') logging.info('Ending feed thread...')
except ValueError as e: except ValueError as e:
logging.error('feed_thread error: {} {}'.format(e.__class__.__name__, e)) logging.critical('feed_thread error: {} {}'.format(e.__class__.__name__, e))
http_server.stop() http_server.stop()
print('Starting Feed thread...') logging.info('Starting Feed thread...')
gevent.spawn(feed_thread) gevent.spawn(feed_thread)
print('Starting HTTP thread...') logging.info('Starting HTTP thread...')
try: try:
http_server.serve_forever() http_server.serve_forever()
except KeyboardInterrupt: except KeyboardInterrupt:
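The smallweb filter in the `/api` route above matches each story's hostname against a set of known hosts. A minimal sketch of that matching step in isolation is given below, with hypothetical helper names (`normalize_host`, `filter_smallweb`) that are not part of the repository. It deliberately differs from the diff in one respect: it strips only a leading `www.`, whereas `str.replace('www.', '')` would also remove that substring from the middle of a hostname.

```python
from urllib.parse import urlparse


def normalize_host(url):
    """Return the hostname of url without a leading 'www.', or '' if absent."""
    hostname = urlparse(url).hostname or ''
    if hostname.startswith('www.'):
        hostname = hostname[len('www.'):]
    return hostname


def filter_smallweb(stories, smallweb_hosts):
    """Keep stories whose 'url' (or 'link' as a fallback) is on a smallweb host."""
    kept = []
    for story in stories:
        host = normalize_host(story.get('url') or story.get('link') or '')
        if host and host in smallweb_hosts:
            kept.append(story)
    return kept


if __name__ == '__main__':
    stories = [
        {'url': 'https://www.example.com/post'},
        {'link': 'https://other.org/thread'},
        {'url': ''},
    ]
    print(filter_smallweb(stories, {'example.com'}))
```

Because the route filters after fetching, it has to keep requesting further pages until it collects `limit` matches or the database runs out of stories, which is what the `while` loop in the diff does.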
+50
@@ -0,0 +1,50 @@
# QotNews settings
# edit this file and save it as settings.py
# Feed Lengths
# Number of top items from each site to pull
# set to 0 to disable that site
FEED_LENGTH = 75
NUM_HACKERNEWS = 15
NUM_LOBSTERS = 10
NUM_REDDIT = 15
NUM_TILDES = 5
# Meilisearch server URL
# Leave blank if not using search
#MEILI_URL = 'http://127.0.0.1:7700/'
MEILI_URL = ''
# Readerserver URL
# Leave blank if not using, but that defeats the whole point
READER_URL = 'http://127.0.0.1:33843/'
# Reddit account info
# leave blank if not using Reddit
REDDIT_CLIENT_ID = ''
REDDIT_CLIENT_SECRET = ''
REDDIT_USER_AGENT = ''
SUBREDDITS = [
'Economics',
'AcademicPhilosophy',
'DepthHub',
'Foodforthought',
'HistoryofIdeas',
'LaymanJournals',
'PhilosophyofScience',
'StateOfTheUnion',
'TheAgora',
'TrueReddit',
'culturalstudies',
'hardscience',
'indepthsports',
'indepthstories',
'ludology',
'neurophilosophy',
'resilientcommunities',
'worldevents',
'StallmanWasRight',
'EverythingScience',
'longevity',
]
+10 -1
@@ -8,8 +8,17 @@ import string
from bleach.sanitizer import Cleaner from bleach.sanitizer import Cleaner
def alert_tanner(message):
try:
logging.info('Alerting Tanner: ' + message)
params = dict(qotnews=message)
requests.get('https://tbot.tanner.vc/message', params=params, timeout=4)
except BaseException as e:
logging.error('Problem alerting Tanner: ' + str(e))
NUM_ID_CHARS = 4
def gen_rand_id(): def gen_rand_id():
return ''.join(random.choice(string.ascii_uppercase) for _ in range(4)) return ''.join(random.choice(string.ascii_uppercase) for _ in range(NUM_ID_CHARS))
def render_md(md): def render_md(md):
if md: if md:
+7 -2
@@ -4,7 +4,7 @@ const port = 33843;
const request = require('request'); const request = require('request');
const JSDOM = require('jsdom').JSDOM; const JSDOM = require('jsdom').JSDOM;
const Readability = require('readability'); const { Readability } = require('readability');
app.use(express.urlencoded({ extended: true })); app.use(express.urlencoded({ extended: true }));
@@ -35,8 +35,13 @@ app.post('/', (req, res) => {
const url = req.body.url; const url = req.body.url;
const requestOptions = { const requestOptions = {
url: url, url: url,
gzip: true,
//headers: {'User-Agent': 'Googlebot/2.1 (+http://www.google.com/bot.html)'}, //headers: {'User-Agent': 'Googlebot/2.1 (+http://www.google.com/bot.html)'},
headers: {'User-Agent': 'Twitterbot/1.0'}, //headers: {'User-Agent': 'Twitterbot/1.0'},
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0',
'X-Forwarded-For': '66.249.66.1',
},
}; };
console.log('Parse request for:', url); console.log('Parse request for:', url);
+292 -311
File diff suppressed because it is too large
+7
@@ -0,0 +1,7 @@
# Editor
*.swp
*.swo
meilisearch-linux-amd64
data.ms/
data.ms.old/
+14
@@ -0,0 +1,14 @@
# Qotnews Search Server
Download MeiliSearch with:
```
wget https://github.com/meilisearch/meilisearch/releases/download/v0.27.0/meilisearch-linux-amd64
chmod +x meilisearch-linux-amd64
```
Run with:
```
MEILI_NO_ANALYTICS=true ./meilisearch-linux-amd64
```
+2
@@ -4,12 +4,14 @@
"private": true, "private": true,
"dependencies": { "dependencies": {
"abort-controller": "^3.0.0", "abort-controller": "^3.0.0",
"katex": "^0.16.25",
"localforage": "^1.7.3", "localforage": "^1.7.3",
"moment": "^2.24.0", "moment": "^2.24.0",
"query-string": "^6.8.3", "query-string": "^6.8.3",
"react": "^16.9.0", "react": "^16.9.0",
"react-dom": "^16.9.0", "react-dom": "^16.9.0",
"react-helmet": "^5.2.1", "react-helmet": "^5.2.1",
"react-latex-next": "^3.0.0",
"react-router-dom": "^5.0.1", "react-router-dom": "^5.0.1",
"react-router-hash-link": "^1.2.2", "react-router-hash-link": "^1.2.2",
"react-scripts": "3.1.1" "react-scripts": "3.1.1"
+98 -4
@@ -8,6 +8,8 @@
content="{{ description }}" content="{{ description }}"
/> />
<meta content="{{ url }}" name="og:site_name"> <meta content="{{ url }}" name="og:site_name">
<meta name="robots" content="{{ robots }}">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png"> <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png"> <link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
@@ -26,20 +28,112 @@
work correctly both with client-side routing and a non-root public URL. work correctly both with client-side routing and a non-root public URL.
Learn how to configure a non-root public URL by running `npm run build`. Learn how to configure a non-root public URL by running `npm run build`.
--> -->
<title>{{ title }} - QotNews</title> <title>{{ title }}</title>
<style> <style>
html { html {
overflow-y: scroll; overflow-y: scroll;
} }
body { body {
background: #000; background: #eeeeee;
} }
</style> </style>
</head> </head>
<body> <body>
<noscript style="background: white">You need to enable JavaScript to run this app.</noscript> <div id="root">
<div id="root"></div> <div class="container menu">
<p>
<a href="/">QotNews</a>
<br />
<span class="slogan">Hacker News, Reddit, Lobsters, and Tildes articles rendered in reader mode.</span>
</p>
</div>
{% if story %}
<div class="{% if show_comments %}container{% else %}article-container{% endif %}">
<div class="article">
<h1>{{ story.title }}</h1>
{% if show_comments %}
<div class="info">
<a href="/{{ story.id }}">View article</a>
</div>
{% else %}
<div class="info">
Source: <a class="source" href="{{ story.url or story.link }}">{{ url }}</a>
</div>
{% endif %}
<div class="info">
{{ story.score }} points
by <a href="{{ story.author_link }}">{{ story.author }}</a>
{{ story.date | fromnow }}
on <a href="{{ story.link }}">{{ story.source }}</a> |
<a href="/{{ story.id }}/c">
{{ story.num_comments }} comment{{ 's' if story.num_comments != 1 }}
</a>
</div>
{% if not show_comments and story.text %}
<div class="story-text">{{ story.text | safe }}</div>
{% elif show_comments %}
{% macro render_comment(comment, level) %}
<dt></dt>
<dd class="comment{% if level > 0 %} lined{% endif %}">
<div class="info">
<p>
{% if comment.author == story.author %}[OP] {% endif %}{{ comment.author or '[Deleted]' }} | <a href="#{{ comment.author }}{{ comment.date }}" id="{{ comment.author }}{{ comment.date }}">{{ comment.date | fromnow }}</a>
</p>
</div>
<div class="text">{{ (comment.text | safe) if comment.text else '<p>[Empty / deleted comment]</p>' }}</div>
{% if comment.comments %}
<dl>
{% for reply in comment.comments %}
{{ render_comment(reply, level + 1) }}
{% endfor %}
</dl>
{% endif %}
</dd>
{% endmacro %}
<dl class="comments">
{% for comment in story.comments %}{{ render_comment(comment, 0) }}{% endfor %}
</dl>
{% endif %}
</div>
<div class='dot toggleDot'>
<div class='button'>
<a href="/{{ story.id }}{{ '/c' if not show_comments else '' }}">
{{ '' if not show_comments else '' }}
</a>
</div>
</div>
</div>
{% elif stories %}
<div class="container">
{% for story in stories %}
<div class='item'>
<div class='title'>
<a class='link' href='/{{ story.id }}'>
<img class='source-logo' src='/logos/{{ story.source }}.png' alt='{{ story.source }}:' /> {{ story.title }}
</a>
<span class='source'>
(<a class='source' href='{{ story.url or story.link }}'>{{ story.hostname }}</a>)
</span>
</div>
<div class='info'>
{{ story.score }} points
by <a href="{{ story.author_link }}">{{ story.author }}</a>
{{ story.date | fromnow }}
on <a href="{{ story.link }}">{{ story.source }}</a> |
<a class="{{ 'hot' if story.num_comments > 99 else '' }}" href="/{{ story.id }}/c">
{{ story.num_comments }} comment{{ 's' if story.num_comments != 1 }}
</a>
</div>
</div>
{% endfor %}
</div>
{% endif %}
</div>
<!-- <!--
This HTML file is a template. This HTML file is a template.
If you open it directly in the browser, you will see an empty page. If you open it directly in the browser, you will see an empty page.

Image changed. Size: 538 B before, 538 B after.
Binary file not shown.
Image added. Size: 981 B.
Image changed. Size: 6.5 KiB before, 6.5 KiB after.
Image changed. Size: 5.4 KiB before, 5.4 KiB after.
Image changed. Size: 500 B before, 500 B after.

+82 -37
@@ -1,10 +1,12 @@
import React from 'react'; import React, { useState, useEffect, useRef, useCallback } from 'react';
import { BrowserRouter as Router, Route, Link, Switch } from 'react-router-dom'; import { BrowserRouter as Router, Route, Link, Switch } from 'react-router-dom';
import localForage from 'localforage'; import localForage from 'localforage';
import './Style-light.css'; import './Style-light.css';
import './Style-dark.css'; import './Style-dark.css';
import './Style-black.css';
import './Style-red.css';
import './fonts/Fonts.css'; import './fonts/Fonts.css';
import { ForwardDot } from './utils.js'; import { BackwardDot, ForwardDot } from './utils.js';
import Feed from './Feed.js'; import Feed from './Feed.js';
import Article from './Article.js'; import Article from './Article.js';
import Comments from './Comments.js'; import Comments from './Comments.js';
@@ -13,72 +15,115 @@ import Submit from './Submit.js';
import Results from './Results.js'; import Results from './Results.js';
import ScrollToTop from './ScrollToTop.js'; import ScrollToTop from './ScrollToTop.js';
class App extends React.Component { function App() {
constructor(props) { const [theme, setTheme] = useState(localStorage.getItem('theme') || '');
super(props); const cache = useRef({});
const [isFullScreen, setIsFullScreen] = useState(!!document.fullscreenElement);
this.state = { const updateCache = useCallback((key, value) => {
theme: localStorage.getItem('theme') || '', cache.current[key] = value;
}, []);
const light = () => {
setTheme('');
localStorage.setItem('theme', '');
}; };
this.cache = {}; const dark = () => {
} setTheme('dark');
updateCache = (key, value) => {
this.cache[key] = value;
}
light() {
this.setState({ theme: '' });
localStorage.setItem('theme', '');
}
dark() {
this.setState({ theme: 'dark' });
localStorage.setItem('theme', 'dark'); localStorage.setItem('theme', 'dark');
} };
componentDidMount() { const black = () => {
if (!this.cache.length) { setTheme('black');
localStorage.setItem('theme', 'black');
};
const red = () => {
setTheme('red');
localStorage.setItem('theme', 'red');
};
useEffect(() => {
if (Object.keys(cache.current).length === 0) {
localForage.iterate((value, key) => { localForage.iterate((value, key) => {
this.updateCache(key, value); updateCache(key, value);
}); }).then(() => {
console.log('loaded cache from localforage'); console.log('loaded cache from localforage');
});
} }
} }, [updateCache]);
render() { const goFullScreen = () => {
const theme = this.state.theme; if ('wakeLock' in navigator) {
document.body.style.backgroundColor = theme === 'dark' ? '#000' : '#eeeeee'; navigator.wakeLock.request('screen');
}
document.body.requestFullscreen({ navigationUI: 'hide' });
};
const exitFullScreen = () => {
document.exitFullscreen();
};
useEffect(() => {
const onFullScreenChange = () => setIsFullScreen(!!document.fullscreenElement);
document.addEventListener('fullscreenchange', onFullScreenChange);
return () => document.removeEventListener('fullscreenchange', onFullScreenChange);
}, []);
useEffect(() => {
if (theme === 'dark') {
document.body.style.backgroundColor = '#1a1a1a';
} else if (theme === 'black') {
document.body.style.backgroundColor = '#000';
} else if (theme === 'red') {
document.body.style.backgroundColor = '#000';
} else {
document.body.style.backgroundColor = '#eeeeee';
}
}, [theme]);
const fullScreenAvailable = document.fullscreenEnabled ||
document.mozFullscreenEnabled ||
document.webkitFullscreenEnabled ||
document.msFullscreenEnabled;
return ( return (
<div className={theme}> <div className={theme}>
<Router> <Router>
<div className='container menu'> <div className='container menu'>
<p> <p>
<Link to='/'>QotNews - Feed</Link> <Link to='/'>QotNews</Link>
<span className='theme'>Theme: <a href='#' onClick={() => this.light()}>Light</a> - <a href='#' onClick={() => this.dark()}>Dark</a></span>
<span className='theme'><a href='#' onClick={() => light()}>Light</a> - <a href='#' onClick={() => dark()}>Dark</a> - <a href='#' onClick={() => black()}>Black</a> - <a href='#' onClick={() => red()}>Red</a></span>
<br /> <br />
<span className='slogan'>Reddit, Hacker News, and Tildes combined, then pre-rendered in reader mode.</span> <span className='slogan'>Hacker News, Reddit, Lobsters, and Tildes articles rendered in reader mode.</span>
</p> </p>
{fullScreenAvailable &&
<Route path='/(|search)' render={() => !isFullScreen ?
<button className='fullscreen' onClick={() => goFullScreen()}>Enter Fullscreen</button>
:
<button className='fullscreen' onClick={() => exitFullScreen()}>Exit Fullscreen</button>
} />
}
<Route path='/(|search)' component={Search} /> <Route path='/(|search)' component={Search} />
<Route path='/(|search)' component={Submit} /> <Route path='/(|search)' component={Submit} />
</div> </div>
<Route path='/' exact render={(props) => <Feed {...props} updateCache={this.updateCache} />} /> <Route path='/' exact render={(props) => <Feed {...props} updateCache={updateCache} />} />
<Switch> <Switch>
<Route path='/search' component={Results} /> <Route path='/search' component={Results} />
<Route path='/:id' exact render={(props) => <Article {...props} cache={this.cache} />} /> <Route path='/:id' exact render={(props) => <Article {...props} cache={cache.current} />} />
</Switch> </Switch>
<Route path='/:id/c' exact render={(props) => <Comments {...props} cache={this.cache} />} /> <Route path='/:id/c' exact render={(props) => <Comments {...props} cache={cache.current} />} />
<BackwardDot />
<ForwardDot /> <ForwardDot />
<ScrollToTop /> <ScrollToTop />
</Router> </Router>
</div> </div>
); );
}
} }
export default App; export default App;
+168 -48
@@ -1,76 +1,207 @@
import React from 'react'; import React, { useState, useEffect } from 'react';
import { useParams } from 'react-router-dom';
import { Helmet } from 'react-helmet'; import { Helmet } from 'react-helmet';
import localForage from 'localforage'; import localForage from 'localforage';
import { sourceLink, infoLine, ToggleDot } from './utils.js'; import { sourceLink, infoLine, ToggleDot } from './utils.js';
import Latex from 'react-latex-next';
import 'katex/dist/katex.min.css';
class Article extends React.Component { const VOID_ELEMENTS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 'track', 'wbr'];
constructor(props) { const DANGEROUS_TAGS = ['svg', 'math'];
super(props);
const id = this.props.match ? this.props.match.params.id : 'CLOL'; const latexDelimiters = [
const cache = this.props.cache; { left: '$$', right: '$$', display: true },
{ left: '\\[', right: '\\]', display: true },
{ left: '\\(', right: '\\)', display: false }
];
function Article({ cache }) {
const { id } = useParams();
if (id in cache) console.log('cache hit'); if (id in cache) console.log('cache hit');
this.state = { const [story, setStory] = useState(cache[id] || false);
story: cache[id] || false, const [error, setError] = useState('');
error: false, const [pConv, setPConv] = useState([]);
pConv: [], const [copyButtonText, setCopyButtonText] = useState('\ue92c');
};
}
componentDidMount() {
const id = this.props.match ? this.props.match.params.id : 'CLOL';
useEffect(() => {
localForage.getItem(id) localForage.getItem(id)
.then( .then(
(value) => { (value) => {
if (value) { if (value) {
this.setState({ story: value }); setStory(value);
} }
} }
); );
fetch('/api/' + id) fetch('/api/' + id)
.then(res => res.json()) .then(res => {
if (!res.ok) {
throw new Error(`Server responded with ${res.status} ${res.statusText}`);
}
return res.json();
})
.then( .then(
(result) => { (result) => {
this.setState({ story: result.story }); setStory(result.story);
localForage.setItem(id, result.story); localForage.setItem(id, result.story);
}, },
(error) => { (error) => {
this.setState({ error: true }); const errorMessage = `Failed to fetch new article content (ID: ${id}). Your connection may be down or the server might be experiencing issues. ${error.toString()}.`;
setError(errorMessage);
} }
); );
}, [id]);
const copyLink = () => {
navigator.clipboard.writeText(`${story.title}:\n${window.location.href}`).then(() => {
setCopyButtonText('\uea10');
setTimeout(() => setCopyButtonText('\ue92c'), 2000);
}, () => {
setCopyButtonText('\uea0f');
setTimeout(() => setCopyButtonText('\ue92c'), 2000);
});
};
const pConvert = (n) => {
setPConv(prevPConv => [...prevPConv, n]);
};
const isCodeBlock = (v) => {
if (v.localName === 'pre') {
return true;
}
if (v.localName === 'code') {
if (v.closest('p')) {
return false;
}
const parent = v.parentElement;
if (parent) {
const nonWhitespaceChildren = Array.from(parent.childNodes).filter(n => {
return n.nodeType !== Node.TEXT_NODE || n.textContent.trim() !== '';
});
if (nonWhitespaceChildren.length === 1 && nonWhitespaceChildren[0] === v) {
return true;
}
}
}
return false;
};
const renderNodes = (nodes, keyPrefix = '') => {
return Array.from(nodes).map((v, k) => {
const key = `${keyPrefix}${k}`;
if (pConv.includes(key)) {
return (
<React.Fragment key={key}>
{v.textContent.split('\n\n').map((x, i) =>
<p key={i}>{x}</p>
)}
</React.Fragment>
);
} }
pConvert = (n) => { if (v.nodeName === '#text') {
this.setState({ pConv: [...this.state.pConv, n]}); const text = v.data;
if (text.includes('\\[') || text.includes('\\(') || text.includes('$$')) {
return <Latex key={key} delimiters={latexDelimiters}>{text}</Latex>;
} }
render() { // Only wrap top-level text nodes in <p>
const id = this.props.match ? this.props.match.params.id : 'CLOL'; if (keyPrefix === '' && v.data.trim() !== '') {
const story = this.state.story; return <p key={key}>{v.data}</p>;
const error = this.state.error; }
const pConv = this.state.pConv; return v.data;
let nodes = null; }
if (story.text) { if (v.nodeType !== Node.ELEMENT_NODE) {
let domparser = new DOMParser(); return null;
let doc = domparser.parseFromString(story.text, 'text/html'); }
nodes = doc.querySelector('body').children;
if (DANGEROUS_TAGS.includes(v.localName)) {
return <span key={key} dangerouslySetInnerHTML={{ __html: v.outerHTML }} />;
}
const Tag = v.localName;
if (isCodeBlock(v)) {
return (
<React.Fragment key={key}>
<Tag dangerouslySetInnerHTML={{ __html: v.innerHTML }} />
<button onClick={() => pConvert(key)}>Convert Code to Paragraph</button>
</React.Fragment>
);
}
const textContent = v.textContent.trim();
const isMath = (textContent.startsWith('\\(') && textContent.endsWith('\\)')) ||
(textContent.startsWith('\\[') && textContent.endsWith('\\]')) ||
(textContent.startsWith('$$') && textContent.endsWith('$$'));
const props = { key: key };
if (v.hasAttributes()) {
for (const attr of v.attributes) {
const name = attr.name === 'class' ? 'className' : attr.name;
props[name] = attr.value;
}
}
if (isMath) {
let mathContent = v.textContent;
// align environment requires display math mode
if (mathContent.includes('\\begin{align')) {
const trimmed = mathContent.trim();
if (trimmed.startsWith('\\(')) {
// Replace \( and \) with \[ and \] to switch to display mode
const firstParen = mathContent.indexOf('\\(');
const lastParen = mathContent.lastIndexOf('\\)');
mathContent = mathContent.substring(0, firstParen) + '\\[' + mathContent.substring(firstParen + 2, lastParen) + '\\]' + mathContent.substring(lastParen + 2);
}
}
return <Tag {...props}><Latex delimiters={latexDelimiters}>{mathContent}</Latex></Tag>;
}
if (VOID_ELEMENTS.includes(Tag)) {
return <Tag {...props} />;
} }
return (
<Tag {...props}>
{renderNodes(v.childNodes, `${key}-`)}
</Tag>
);
});
};
const nodes = (s) => {
if (s && s.text) {
let div = document.createElement('div');
div.innerHTML = s.text;
return div.childNodes;
}
return null;
};
const storyNodes = nodes(story);
return ( return (
<div className='article-container'> <div className='article-container'>
{error && <p>Connection error?</p>} {error &&
<details style={{marginBottom: '1rem'}}>
<summary>Connection error? Click to expand.</summary>
<p>{error}</p>
{story && <p>Loaded article from cache.</p>}
</details>
}
{story ? {story ?
<div className='article'> <div className='article'>
<Helmet> <Helmet>
<title>{story.title} - QotNews</title> <title>{story.title} | QotNews</title>
<meta name="robots" content="noindex" />
</Helmet> </Helmet>
<h1>{story.title}</h1> <h1>{story.title} <button className='copy-button' onClick={copyLink}>{copyButtonText}</button></h1>
<div className='info'> <div className='info'>
Source: {sourceLink(story)} Source: {sourceLink(story)}
@@ -78,31 +209,20 @@ class Article extends React.Component {
{infoLine(story)} {infoLine(story)}
{nodes ? {storyNodes ?
<div className='story-text'> <div className='story-text'>
{Object.entries(nodes).map(([k, v]) => {renderNodes(storyNodes)}
pConv.includes(k) ?
v.innerHTML.split('\n\n').map(x =>
<p dangerouslySetInnerHTML={{ __html: x }} />
)
:
<>
<v.localName dangerouslySetInnerHTML={{ __html: v.innerHTML }} />
{v.localName == 'pre' && <button onClick={() => this.pConvert(k)}>Convert Code to Paragraph</button>}
</>
)}
</div> </div>
: :
<p>Problem getting article :(</p> <p>Problem getting article :(</p>
} }
</div> </div>
: :
<p>loading...</p> <p>Loading...</p>
} }
<ToggleDot id={id} article={false} /> <ToggleDot id={id} article={false} />
</div> </div>
); );
}
} }
export default Article; export default Article;
+60 -62
@@ -1,83 +1,80 @@
import React from 'react'; import React, { useState, useEffect } from 'react';
import { Link } from 'react-router-dom'; import { Link, useParams } from 'react-router-dom';
import { HashLink } from 'react-router-hash-link'; import { HashLink } from 'react-router-hash-link';
import { Helmet } from 'react-helmet'; import { Helmet } from 'react-helmet';
import moment from 'moment'; import moment from 'moment';
import localForage from 'localforage'; import localForage from 'localforage';
import { infoLine, ToggleDot } from './utils.js'; import { infoLine, ToggleDot } from './utils.js';
class Article extends React.Component { function countComments(c) {
constructor(props) { return c.comments.reduce((sum, x) => sum + countComments(x), 1);
super(props); }
const id = this.props.match.params.id; function Comments({ cache }) {
const cache = this.props.cache; const { id } = useParams();
if (id in cache) console.log('cache hit'); if (id in cache) console.log('cache hit');
this.state = { const [story, setStory] = useState(cache[id] || false);
story: cache[id] || false, const [error, setError] = useState('');
error: false, const [collapsed, setCollapsed] = useState([]);
collapsed: [], const [expanded, setExpanded] = useState([]);
expanded: [],
};
}
componentDidMount() {
const id = this.props.match.params.id;
useEffect(() => {
localForage.getItem(id) localForage.getItem(id)
.then( .then(
(value) => { (value) => {
this.setState({ story: value }); if (value) {
setStory(value);
}
} }
); );
fetch('/api/' + id) fetch('/api/' + id)
.then(res => res.json()) .then(res => {
if (!res.ok) {
throw new Error(`Server responded with ${res.status} ${res.statusText}`);
}
return res.json();
})
.then( .then(
(result) => { (result) => {
this.setState({ story: result.story }, () => { setStory(result.story);
localForage.setItem(id, result.story);
const hash = window.location.hash.substring(1); const hash = window.location.hash.substring(1);
if (hash) { if (hash) {
document.getElementById(hash).scrollIntoView(); setTimeout(() => {
const element = document.getElementById(hash);
if (element) {
element.scrollIntoView();
}
}, 0);
} }
});
localForage.setItem(id, result.story);
}, },
(error) => { (error) => {
this.setState({ error: true }); const errorMessage = `Failed to fetch comments (ID: ${id}). Your connection may be down or the server might be experiencing issues. ${error.toString()}.`;
setError(errorMessage);
} }
); );
} }, [id]);
collapseComment(cid) { const collapseComment = (cid) => {
this.setState(prevState => ({ setCollapsed(prev => [...prev, cid]);
...prevState, setExpanded(prev => prev.filter(x => x !== cid));
collapsed: [...prevState.collapsed, cid], };
expanded: prevState.expanded.filter(x => x !== cid),
}));
}
expandComment(cid) { const expandComment = (cid) => {
this.setState(prevState => ({ setCollapsed(prev => prev.filter(x => x !== cid));
...prevState, setExpanded(prev => [...prev, cid]);
collapsed: prevState.collapsed.filter(x => x !== cid), };
expanded: [...prevState.expanded, cid],
}));
}
countComments(c) { const displayComment = (story, c, level) => {
return c.comments.reduce((sum, x) => sum + this.countComments(x), 1);
}
displayComment(story, c, level) {
const cid = c.author+c.date; const cid = c.author+c.date;
const collapsed = this.state.collapsed.includes(cid); const isCollapsed = collapsed.includes(cid);
const expanded = this.state.expanded.includes(cid); const isExpanded = expanded.includes(cid);
const hidden = collapsed || (level == 4 && !expanded); const hidden = isCollapsed || (level == 4 && !isExpanded);
const hasChildren = c.comments.length !== 0; const hasChildren = c.comments.length !== 0;
return ( return (
@@ -88,34 +85,36 @@ class Article extends React.Component {
{' '} | <HashLink to={'#'+cid} id={cid}>{moment.unix(c.date).fromNow()}</HashLink> {' '} | <HashLink to={'#'+cid} id={cid}>{moment.unix(c.date).fromNow()}</HashLink>
{hidden || hasChildren && {hidden || hasChildren &&
<span className='collapser pointer' onClick={() => this.collapseComment(cid)}></span> <button className='collapser pointer' onClick={() => collapseComment(cid)}></button>
} }
</p> </p>
</div> </div>
<div className='text' dangerouslySetInnerHTML={{ __html: c.text }} /> <div className={isCollapsed ? 'text hidden' : 'text'} dangerouslySetInnerHTML={{ __html: c.text || '<p>[Empty / deleted comment]</p>'}} />
{hidden && hasChildren ? {hidden && hasChildren ?
<div className='comment lined info pointer' onClick={() => this.expandComment(cid)}>[show {this.countComments(c)-1} more]</div> <button className='comment lined info pointer' onClick={() => expandComment(cid)}>[show {countComments(c)-1} more]</button>
: :
c.comments.map(i => this.displayComment(story, i, level + 1)) c.comments.map(i => displayComment(story, i, level + 1))
} }
</div> </div>
); );
} };
render() {
const id = this.props.match.params.id;
const story = this.state.story;
const error = this.state.error;
return ( return (
<div className='container'> <div className='container'>
{error && <p>Connection error?</p>} {error &&
<details style={{marginBottom: '1rem'}}>
<summary>Connection error? Click to expand.</summary>
<p>{error}</p>
{story && <p>Loaded comments from cache.</p>}
</details>
}
{story ? {story ?
<div className='article'> <div className='article'>
<Helmet> <Helmet>
<title>{story.title} - QotNews Comments</title> <title>{story.title} | QotNews</title>
<meta name="robots" content="noindex" />
</Helmet> </Helmet>
<h1>{story.title}</h1> <h1>{story.title}</h1>
@@ -127,7 +126,7 @@ class Article extends React.Component {
{infoLine(story)} {infoLine(story)}
<div className='comments'> <div className='comments'>
{story.comments.map(c => this.displayComment(story, c, 0))} {story.comments.map(c => displayComment(story, c, 0))}
</div> </div>
</div> </div>
: :
@@ -136,7 +135,6 @@ class Article extends React.Component {
<ToggleDot id={id} article={true} /> <ToggleDot id={id} article={true} />
</div> </div>
); );
}
} }
export default Article; export default Comments;
+115 -40
@@ -1,71 +1,145 @@
import React from 'react'; import React, { useState, useEffect } from 'react';
import { Link } from 'react-router-dom'; import { Link } from 'react-router-dom';
import { Helmet } from 'react-helmet'; import { Helmet } from 'react-helmet';
import localForage from 'localforage'; import localForage from 'localforage';
import { sourceLink, infoLine, logos } from './utils.js'; import { sourceLink, infoLine, logos } from './utils.js';
class Feed extends React.Component { function Feed({ updateCache }) {
constructor(props) { const [stories, setStories] = useState(() => JSON.parse(localStorage.getItem('stories')) || false);
super(props); const [error, setError] = useState('');
const [loadingStatus, setLoadingStatus] = useState(null);
const [filterSmallweb, setFilterSmallweb] = useState(() => localStorage.getItem('filterSmallweb') === 'true');
this.state = { const handleFilterChange = e => {
stories: JSON.parse(localStorage.getItem('stories')) || false, const isChecked = e.target.checked;
error: false, setStories(false);
setFilterSmallweb(isChecked);
localStorage.setItem('filterSmallweb', isChecked);
}; };
}
componentDidMount() { useEffect(() => {
fetch('/api') const controller = new AbortController();
.then(res => res.json())
fetch(filterSmallweb ? '/api?smallweb=true' : '/api', { signal: controller.signal })
.then(res => {
if (!res.ok) {
throw new Error(`Server responded with ${res.status} ${res.statusText}`);
}
return res.json();
})
.then( .then(
(result) => { async (result) => {
const updated = !this.state.stories || this.state.stories[0].id !== result.stories[0].id; const newApiStories = result.stories;
console.log('updated:', updated);
this.setState({ stories: result.stories }); const updated = !stories || !stories.length || stories[0].id !== newApiStories[0].id;
localStorage.setItem('stories', JSON.stringify(result.stories)); console.log('New stories available:', updated);
if (updated) { if (!updated) return;
localForage.clear();
result.stories.forEach((x, i) => { setLoadingStatus({ current: 0, total: newApiStories.length });
fetch('/api/' + x.id)
.then(res => res.json()) let currentStories = Array.isArray(stories) ? [...stories] : [];
.then(result => { let preloadedCount = 0;
localForage.setItem(x.id, result.story)
.then(console.log('preloaded', x.id, x.title)); for (const [index, newStory] of newApiStories.entries()) {
this.props.updateCache(x.id, result.story); if (controller.signal.aborted) {
}, error => {} break;
);
});
} }
try {
const storyFetchController = new AbortController();
const timeoutId = setTimeout(() => storyFetchController.abort(), 10000); // 10-second timeout
const storyRes = await fetch('/api/' + newStory.id, { signal: storyFetchController.signal });
clearTimeout(timeoutId);
if (!storyRes.ok) {
throw new Error(`Server responded with ${storyRes.status} ${storyRes.statusText}`);
}
const storyResult = await storyRes.json();
const fullStory = storyResult.story;
await localForage.setItem(fullStory.id, fullStory);
console.log('Preloaded story:', fullStory.id, fullStory.title);
updateCache(fullStory.id, fullStory);
preloadedCount++;
setLoadingStatus({ current: preloadedCount, total: newApiStories.length });
const existingStoryIndex = currentStories.findIndex(s => s.id === newStory.id);
if (existingStoryIndex > -1) {
currentStories.splice(existingStoryIndex, 1);
}
currentStories.splice(index, 0, newStory);
localStorage.setItem('stories', JSON.stringify(currentStories));
setStories(currentStories);
} catch (error) {
let errorMessage;
if (error.name === 'AbortError') {
errorMessage = `The request to fetch story '${newStory.title}' (${newStory.id}) timed out after 10 seconds. Your connection may be unstable. (${preloadedCount} / ${newApiStories.length} stories preloaded)`;
console.log('Fetch timed out for story:', newStory.id);
} else {
errorMessage = `An error occurred while fetching story '${newStory.title}' (ID: ${newStory.id}): ${error.toString()}. (${preloadedCount} / ${newApiStories.length} stories preloaded)`;
console.log('Fetch failed for story:', newStory.id, error);
}
setError(errorMessage);
break;
}
}
const finalStories = currentStories.slice(0, newApiStories.length);
const removedStories = currentStories.slice(newApiStories.length);
for (const story of removedStories) {
console.log('Removed story:', story.id, story.title);
localForage.removeItem(story.id);
}
localStorage.setItem('stories', JSON.stringify(finalStories));
setStories(finalStories);
setLoadingStatus(null);
}, },
(error) => { (error) => {
this.setState({ error: true }); if (error.name === 'AbortError') {
console.log('Feed fetch aborted.');
return;
}
const errorMessage = `Failed to fetch the main story list from the API. Your connection may be down or the server might be experiencing issues. ${error.toString()}.`;
setError(errorMessage);
} }
); );
}
render() { return () => controller.abort();
const stories = this.state.stories; }, [updateCache, filterSmallweb]);
const error = this.state.error;
return ( return (
<div className='container'> <div className='container'>
<Helmet> <Helmet>
<title>Feed - QotNews</title> <title>QotNews</title>
<meta name="robots" content="index" />
</Helmet> </Helmet>
{error && <p>Connection error?</p>}
<div style={{marginBottom: '1rem'}}>
<input type="checkbox" id="filter-smallweb" className="checkbox" checked={filterSmallweb} onChange={handleFilterChange} />
<label htmlFor="filter-smallweb">Only Smallweb</label>
</div>
{error &&
<details style={{marginBottom: '1rem'}}>
<summary>Connection error? Click to expand.</summary>
<p>{error}</p>
{stories && <p>Loaded feed from cache.</p>}
</details>
}
{stories ? {stories ?
<div> <div>
{stories.map((x, i) => {stories.map(x =>
<div className='item' key={i}> <div className='item' key={x.id}>
<div className='title'> <div className='title'>
<Link className='link' to={'/' + x.id}> <Link className='link' to={'/' + x.id}>
<img className='source-logo' src={logos[x.source]} alt='source logo' /> {x.title} <img className='source-logo' src={logos[x.source]} alt='source logo' /> {x.title}
</Link> </Link>
<span className='source'> <span className='source'>
&#8203;({sourceLink(x)}) ({sourceLink(x)})
</span> </span>
</div> </div>
@@ -74,11 +148,12 @@ class Feed extends React.Component {
)} )}
</div> </div>
: :
<p>loading...</p> <p>Loading...</p>
} }
{loadingStatus && <p>Preloading stories {loadingStatus.current} / {loadingStatus.total}...</p>}
</div> </div>
); );
}
} }
export default Feed; export default Feed;
+43 -39
@@ -1,82 +1,87 @@
import React from 'react'; import React, { useState, useEffect } from 'react';
import { Link } from 'react-router-dom'; import { Link, useLocation, useHistory } from 'react-router-dom';
import { Helmet } from 'react-helmet'; import { Helmet } from 'react-helmet';
import queryString from 'query-string';
import { sourceLink, infoLine, logos } from './utils.js'; import { sourceLink, infoLine, logos } from './utils.js';
import AbortController from 'abort-controller'; import AbortController from 'abort-controller';
class Results extends React.Component { function Results() {
constructor(props) { const [stories, setStories] = useState(false);
super(props); const [error, setError] = useState(false);
const location = useLocation();
const history = useHistory();
this.state = { const handleFilterChange = e => {
stories: false, const isChecked = e.target.checked;
error: false,
const currentQuery = queryString.parse(location.search);
if (isChecked) {
currentQuery.article = 'true';
} else {
delete currentQuery.article;
}
history.push('/search?' + queryString.stringify(currentQuery));
}; };
this.controller = null; useEffect(() => {
} const controller = new AbortController();
const signal = controller.signal;
performSearch = () => { const search = location.search;
if (this.controller) {
this.controller.abort();
}
this.controller = new AbortController();
const signal = this.controller.signal;
const search = this.props.location.search;
fetch('/api/search' + search, { method: 'get', signal: signal }) fetch('/api/search' + search, { method: 'get', signal: signal })
.then(res => res.json()) .then(res => res.json())
.then( .then(
(result) => { (result) => {
this.setState({ stories: result.results }); setStories(result.hits);
}, },
(error) => { (error) => {
if (error.message !== 'The operation was aborted. ') { if (error.message !== 'The operation was aborted. ') {
this.setState({ error: true }); setError(true);
} }
} }
); );
}
componentDidMount() { return () => {
this.performSearch(); controller.abort();
} };
}, [location.search]);
componentDidUpdate(prevProps) { const searchInArticle = queryString.parse(location.search).article === 'true';
if (this.props.location.search !== prevProps.location.search) {
this.performSearch();
}
}
render() {
const stories = this.state.stories;
const error = this.state.error;
return ( return (
<div className='container'> <div className='container'>
<Helmet> <Helmet>
<title>Feed - QotNews</title> <title>Search Results | QotNews</title>
</Helmet> </Helmet>
<div style={{marginBottom: '1rem'}}>
<input type="checkbox" id="search-in-article" className="checkbox" checked={searchInArticle} onChange={handleFilterChange} />
<label htmlFor="search-in-article">Search in article</label>
</div>
{error && <p>Connection error?</p>} {error && <p>Connection error?</p>}
{stories ? {stories ?
<> <>
<p>Search results:</p> <p>Search results:</p>
<div className='comment lined'> <div className='comment lined'>
{stories.length ? {stories.length ?
stories.map((x, i) => stories.map(x =>
<div className='item' key={i}> <div className='item' key={x.id}>
<div className='title'> <div className='title'>
<Link className='link' to={'/' + x.id}> <Link className='link' to={'/' + x.id}>
<img className='source-logo' src={logos[x.source]} alt='source logo' /> {x.title} <img className='source-logo' src={logos[x.source]} alt='source logo' /> {x.title}
</Link> </Link>
<span className='source'> <span className='source'>
&#8203;({sourceLink(x)}) ({sourceLink(x)})
</span> </span>
</div> </div>
{infoLine(x)} {infoLine(x)}
{!!x?._formatted &&
<p>{x._formatted.text.replace(/\n/g, ' ')}</p>
}
</div> </div>
) )
: :
@@ -89,7 +94,6 @@ class Results extends React.Component {
} }
</div> </div>
); );
}
} }
export default Results; export default Results;
+1
@@ -15,6 +15,7 @@ class ScrollToTop extends React.Component {
} }
window.scrollTo(0, 0); window.scrollTo(0, 0);
document.body.scrollTop = 0;
} }
render() { render() {
+28 -29
@@ -1,51 +1,50 @@
import React, { Component } from 'react'; import React, { useState, useRef } from 'react';
import { withRouter } from 'react-router-dom'; import { useHistory, useLocation } from 'react-router-dom';
import queryString from 'query-string'; import queryString from 'query-string';
const getSearch = props => queryString.parse(props.location.search).q; const getSearch = location => queryString.parse(location.search).q || '';
class Search extends Component { function Search() {
constructor(props) { const history = useHistory();
super(props); const location = useLocation();
this.state = {search: getSearch(this.props)}; const [search, setSearch] = useState(getSearch(location));
this.inputRef = React.createRef(); const inputRef = useRef(null);
}
searchArticles = (event) => { const searchArticles = (event) => {
const search = event.target.value; const newSearch = event.target.value;
this.setState({search: search}); setSearch(newSearch);
if (search.length >= 3) { if (newSearch.length >= 3) {
const searchQuery = queryString.stringify({ 'q': search }); const currentQuery = queryString.parse(location.search);
this.props.history.replace('/search?' + searchQuery); currentQuery.q = newSearch;
const searchQuery = queryString.stringify(currentQuery);
history.replace('/search?' + searchQuery);
} else { } else {
this.props.history.replace('/'); history.replace('/');
} }
} }
searchAgain = (event) => { const searchAgain = (event) => {
event.preventDefault(); event.preventDefault();
const searchString = queryString.stringify({ 'q': event.target[0].value }); const currentQuery = queryString.parse(location.search);
this.props.history.push('/search?' + searchString); currentQuery.q = event.target[0].value;
this.inputRef.current.blur(); const searchString = queryString.stringify(currentQuery);
history.push('/search?' + searchString);
inputRef.current.blur();
} }
render() {
const search = this.state.search;
return ( return (
<span className='search'> <span className='search'>
<form onSubmit={this.searchAgain}> <form onSubmit={searchAgain}>
<input <input
placeholder='Search... (fixed)' placeholder='Search...'
value={search} value={search}
onChange={this.searchArticles} onChange={searchArticles}
ref={this.inputRef} ref={inputRef}
/> />
</form> </form>
</span> </span>
); );
}
} }
export default withRouter(Search); export default Search;
+77
@@ -0,0 +1,77 @@
.black {
color: #ddd;
}
.black a {
color: #ddd;
}
.black input {
color: #ddd;
border: 1px solid #828282;
}
.black .menu button,
.black .story-text button {
background-color: #444444;
border-color: #bbb;
color: #ddd;
}
.black .item {
color: #828282;
}
.black .item .source-logo {
filter: grayscale(1);
}
.black .item a {
color: #828282;
}
.black .item a.link {
color: #ddd;
}
.black .item a.link:visited {
color: #828282;
}
.black .item .info a.hot {
color: #cccccc;
}
.black .article a {
border-bottom: 1px solid #aaaaaa;
}
.black .article u {
border-bottom: 1px solid #aaaaaa;
text-decoration: none;
}
.black .story-text video,
.black .story-text img {
filter: brightness(50%);
}
.black .article .info {
color: #828282;
}
.black .article .info a {
border-bottom: none;
color: #828282;
}
.black .comment.lined {
border-left: 1px solid #444444;
}
.black .checkbox:checked + label::after {
border-color: #ddd;
}
.black .copy-button {
color: #828282;
}
+16 -4
@@ -11,12 +11,15 @@
border: 1px solid #828282; border: 1px solid #828282;
} }
.dark .item { .dark .menu button,
color: #828282; .dark .story-text button {
background-color: #444444;
border-color: #bbb;
color: #ddd;
} }
.dark .item .source-logo { .dark .item {
filter: grayscale(1); color: #828282;
} }
.dark .item a { .dark .item a {
@@ -43,6 +46,7 @@
text-decoration: none; text-decoration: none;
} }
.dark .story-text video,
.dark .story-text img { .dark .story-text img {
filter: brightness(50%); filter: brightness(50%);
} }
@@ -59,3 +63,11 @@
.dark .comment.lined { .dark .comment.lined {
border-left: 1px solid #444444; border-left: 1px solid #444444;
} }
.dark .checkbox:checked + label::after {
border-color: #ddd;
}
.dark .copy-button {
color: #828282;
}
+170 -17
@@ -2,9 +2,30 @@ body {
text-rendering: optimizeLegibility; text-rendering: optimizeLegibility;
font: 1rem/1.3 sans-serif; font: 1rem/1.3 sans-serif;
color: #000000; color: #000000;
margin-bottom: 100vh;
word-break: break-word; word-break: break-word;
font-kerning: normal; font-kerning: normal;
margin: 0;
}
::backdrop {
background-color: rgba(0,0,0,0);
}
body:fullscreen {
overflow-y: scroll !important;
}
body:-ms-fullscreen {
overflow-y: scroll !important;
}
body:-webkit-full-screen {
overflow-y: scroll !important;
}
body:-moz-full-screen {
overflow-y: scroll !important;
}
#root {
margin: 8px 8px 100vh 8px !important;
} }
a { a {
@@ -22,10 +43,21 @@ input {
border-radius: 4px; border-radius: 4px;
} }
.fullscreen {
margin: 0.25rem;
padding: 0.25rem;
}
pre { pre {
overflow: auto; overflow: auto;
} }
.comments pre {
overflow: auto;
white-space: pre-wrap;
overflow-wrap: break-word;
}
.container { .container {
margin: 1rem auto; margin: 1rem auto;
max-width: 64rem; max-width: 64rem;
@@ -94,6 +126,13 @@ span.source {
border-bottom: 1px solid #222222; border-bottom: 1px solid #222222;
} }
.article-title {
display: flex;
align-items: center;
margin-top: 0.67em;
margin-bottom: 0.67em;
}
.article h1 { .article h1 {
font-size: 1.6rem; font-size: 1.6rem;
} }
@@ -117,6 +156,11 @@ span.source {
margin: 0; margin: 0;
} }
.article table {
width: 100%;
table-layout: fixed;
}
.article iframe { .article iframe {
display: none; display: none;
} }
@@ -145,6 +189,13 @@ span.source {
.comments { .comments {
margin-left: -1.25rem; margin-left: -1.25rem;
margin-top: 0;
margin-bottom: 0;
padding: 0;
}
.comments dl, .comments dd {
margin: 0;
} }
.comment { .comment {
@@ -157,18 +208,73 @@ span.source {
.comment .text { .comment .text {
margin-top: -0.5rem; margin-top: -0.5rem;
margin-bottom: 1rem;
} }
.toggleDot { .comment .text > * {
margin-bottom: 0;
}
.comment .text.hidden > p {
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
display: none;
color: #828282;
}
.comment .text.hidden > p:first-child {
display: block;
}
.comment .collapser {
padding-left: 0.5rem;
padding-right: 1.5rem;
}
button.collapser {
background: transparent;
border: none;
margin: 0;
padding-top: 0;
padding-bottom: 0;
font: inherit;
color: inherit;
}
button.comment {
background: transparent;
border-top: none;
border-right: none;
border-bottom: none;
margin: 0;
padding-top: 0;
padding-right: 0;
padding-bottom: 0;
font: inherit;
color: inherit;
text-align: left;
width: 100%;
}
.comment .pointer {
cursor: pointer;
}
.dot {
cursor: pointer;
position: fixed; position: fixed;
bottom: 1rem;
left: 1rem;
height: 3rem; height: 3rem;
width: 3rem; width: 3rem;
background-color: #828282; background-color: #828282;
border-radius: 50%; border-radius: 50%;
} }
.toggleDot {
bottom: 1rem;
left: 1rem;
}
.toggleDot .button { .toggleDot .button {
font: 2rem/1 'icomoon'; font: 2rem/1 'icomoon';
position: relative; position: relative;
@@ -177,32 +283,79 @@ span.source {
} }
.forwardDot { .forwardDot {
cursor: pointer;
position: fixed;
bottom: 1rem; bottom: 1rem;
right: 1rem; right: 1rem;
height: 3rem;
width: 3rem;
background-color: #828282;
border-radius: 50%;
} }
.forwardDot .button { .forwardDot .button {
font: 2.5rem/1 'icomoon'; font: 2rem/1 'icomoon';
position: relative; position: relative;
top: 0.25rem; top: 0.5rem;
left: 0.3rem; left: 0.5rem;
}
.backwardDot {
bottom: 1rem;
right: 5rem;
}
.backwardDot .button {
font: 2rem/1 'icomoon';
position: relative;
top: 0.5rem;
left: 0.5rem;
} }
.search form { .search form {
display: inline; display: inline;
} }
.collapser { .copy-button {
padding-left: 0.5rem; font: 1.5rem/1 'icomoon2';
padding-right: 1.5rem; color: #828282;
background: transparent;
border: none;
cursor: pointer;
vertical-align: middle;
} }
.pointer { .checkbox {
-webkit-appearance: none;
appearance: none;
position: absolute;
opacity: 0;
cursor: pointer; cursor: pointer;
height: 0;
width: 0;
}
.checkbox + label {
position: relative;
cursor: pointer;
padding-left: 1.75rem;
user-select: none;
}
.checkbox + label::before {
content: '';
position: absolute;
left: 0;
top: 0.1em;
width: 1rem;
height: 1rem;
border: 1px solid #828282;
background-color: transparent;
border-radius: 3px;
}
.checkbox:checked + label::after {
content: "";
position: absolute;
left: 0.35rem;
top: 0.2em;
width: 0.3rem;
height: 0.6rem;
border: solid #000;
border-width: 0 2px 2px 0;
transform: rotate(45deg);
} }
+95
@@ -0,0 +1,95 @@
.red {
color: #b00;
scrollbar-color: #b00 #440000;
}
.red a {
color: #b00;
}
.red input {
color: #b00;
border: 1px solid #690000;
}
.red input::placeholder {
color: #690000;
}
.red hr {
background-color: #690000;
}
.red .menu button,
.red .story-text button {
background-color: #440000;
border-color: #b00;
color: #b00;
}
.red .item,
.red .slogan {
color: #690000;
}
.red .item .source-logo {
display: none;
}
.red .item a {
color: #690000;
}
.red .item a.link {
color: #b00;
}
.red .item a.link:visited {
color: #690000;
}
.red .item .info a.hot {
color: #cc0000;
}
.red .article a {
border-bottom: 1px solid #aa0000;
}
.red .article u {
border-bottom: 1px solid #aa0000;
text-decoration: none;
}
.red .story-text video,
.red .story-text img {
filter: grayscale(100%) brightness(20%) sepia(100%) hue-rotate(-50deg) saturate(600%) contrast(0.8);
}
.red .article .info {
color: #690000;
}
.red .article .info a {
border-bottom: none;
color: #690000;
}
.red .comment.lined {
border-left: 1px solid #440000;
}
.red .dot {
background-color: #440000;
}
.red .checkbox + label::before {
border: 1px solid #690000;
}
.red .checkbox:checked + label::after {
border-color: #aa0000;
}
.red .copy-button {
color: #690000;
}
+33 -34
@@ -1,54 +1,53 @@
import React, { Component } from 'react'; import React, { useState, useRef } from 'react';
import { withRouter } from 'react-router-dom'; import { useHistory } from 'react-router-dom';
class Submit extends Component { function Submit() {
constructor(props) { const [progress, setProgress] = useState(null);
super(props); const inputRef = useRef(null);
const history = useHistory();
this.state = { const submitArticle = async (event) => {
progress: null,
};
this.inputRef = React.createRef();
}
submitArticle = (event) => {
event.preventDefault(); event.preventDefault();
const url = event.target[0].value; const url = event.target[0].value;
this.inputRef.current.blur(); inputRef.current.blur();
this.setState({ progress: 'Submitting...' }); setProgress('Submitting...');
let data = new FormData(); let data = new FormData();
data.append('url', url); data.append('url', url);
fetch('/api/submit', { method: 'POST', body: data }) try {
.then(res => res.json()) const res = await fetch('/api/submit', { method: 'POST', body: data });
.then(
(result) => {
this.props.history.replace('/' + result.nid);
},
(error) => {
this.setState({ progress: 'Error' });
}
);
}
render() { if (res.ok) {
const progress = this.state.progress; const result = await res.json();
history.replace('/' + result.nid);
} else {
let errorData;
try {
errorData = await res.json();
} catch (jsonError) {
// Not a JSON error from our API, so it's a server issue
throw new Error(`Server responded with ${res.status} ${res.statusText}`);
}
setProgress(errorData.error || 'An unknown error occurred.');
}
} catch (error) {
setProgress(`Error: ${error.toString()}`);
}
}
return ( return (
<span className='search'> <span className='search'>
<form onSubmit={this.submitArticle}> <form onSubmit={submitArticle}>
<input <input
placeholder='Submit Article' placeholder='Submit URL'
ref={this.inputRef} ref={inputRef}
/> />
</form> </form>
{progress ? progress : ''} {progress && <p>{progress}</p>}
</span> </span>
); );
}
} }
export default withRouter(Submit); export default Submit;
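Review note: the new `submitArticle` has three outcomes: redirect on success, show the API's own `error` message, or fall back to a generic server error when the response body is not JSON. The sketch below restates those branches as a pure function so they can be checked without `fetch` or React state. `describeSubmitOutcome` and its argument shape are hypothetical and not part of this commit. It assumes `body` holds the parsed JSON payload, or `undefined` when parsing failed.

```javascript
// Hypothetical helper, not part of the commit: mirrors the branches of
// submitArticle without touching fetch, the router, or component state.
// `body` is the parsed JSON payload, or undefined when the response was not JSON.
function describeSubmitOutcome({ ok, status, statusText, body }) {
  if (ok) {
    // Success: the component calls history.replace('/' + result.nid).
    return { redirect: '/' + body.nid };
  }
  if (body === undefined) {
    // Non-JSON error body: the component throws, and the outer catch
    // formats the message with error.toString().
    const error = new Error(`Server responded with ${status} ${statusText}`);
    return { progress: `Error: ${error.toString()}` };
  }
  // JSON error from the API, with a fallback when no `error` field is set.
  return { progress: body.error || 'An unknown error occurred.' };
}
```

Because `error.toString()` already starts with `Error: `, the non-JSON branch yields a message with a doubled prefix (`Error: Error: Server responded with ...`). Using `error.message` in the outer `catch` would likely avoid that, if the doubling is not intended.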
+5
@@ -26,3 +26,8 @@
font-family: 'Icomoon'; font-family: 'Icomoon';
src: url('icomoon.ttf') format('truetype'); src: url('icomoon.ttf') format('truetype');
} }
@font-face {
font-family: 'Icomoon2';
src: url('icomoon2.ttf') format('truetype');
}
Binary file not shown.
Binary file not shown.
+1 -1
@@ -8,4 +8,4 @@ ReactDOM.render(<App />, document.getElementById('root'));
// If you want your app to work offline and load faster, you can change // If you want your app to work offline and load faster, you can change
// // unregister() to register() below. Note this comes with some pitfalls. // // unregister() to register() below. Note this comes with some pitfalls.
// // Learn more about service workers: https://bit.ly/CRA-PWA // // Learn more about service workers: https://bit.ly/CRA-PWA
serviceWorker.register(); serviceWorker.unregister();
+29 -17
File diff suppressed because one or more lines are too long
+4850 -3318
File diff suppressed because it is too large