In the previous parts, we built the API that lists the news items stored in our database and a scraper that collects news items from two news portals: Setopati and Ekantipur. If your code is not working as expected so far, you can compare it against the working code in the GitHub repository. The link to the repository is at the bottom of this blog post.

In this part we will build the UI for our application in React.

Before we start building the frontend, let’s check our API first. Switch to the root directory (where manage.py is located) and start the Django server:

python manage.py runserver

Next, open the API link in your browser. If your database contains news items, you’ll see them displayed as a list. However, there’s a minor issue: the news items are ordered by the time they were stored in the database. As a result, news from Ekantipur will appear on the earlier pages, while news from Setopati will show up on the later pages.

When we run our crawler, the SetopatiSpider runs first, followed by the EkantipurSpider. This means news items from Ekantipur will have a later created_at time than news from Setopati. As our API lists news items in descending order of created_at, all news items from Ekantipur will appear before the ones from Setopati. This isn’t ideal for us because we don’t want to favor one portal over another. Ideally, we want to display an equal number of news items from both sources on each page. Here is how we will achieve that:

Our API has a page size of 24 as defined in settings.py file under REST_FRAMEWORK settings. So, we will try to fetch 12 news items from Setopati and 12 from Ekantipur on each page. This will ensure that no news portal is prioritized. Update views.py inside the api directory.

news-aggregator/api/views.py

Here, we added a get_queryset method to our ListNews class. The get_queryset method fetches the news items from each source, ordered by the most recent. Then we used zip_longest to combine the two querysets by alternating between them. This keeps the news items in this order: a news item from Setopati, followed by a news item from Ekantipur, followed by a news item from Setopati, and so on. If one source has fewer items, the missing entries are filled with None.

Then we used the from_iterable function, which takes an iterable of iterables (the pairs generated by zip_longest) and flattens it into a single iterable. An iterable is any object that can be looped over with a for loop. To use the flattened result as a list, we wrapped it in list().

Then we removed any None values added by zip_longest when one source has fewer elements. 
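The three steps above (pair, flatten, drop None) can be tried out in isolation. This is a minimal sketch that uses plain lists in place of the two querysets; the item labels are made up for illustration and are not from the project’s database.

```python
from itertools import chain, zip_longest

# Hypothetical stand-ins for the two querysets, each already sorted newest first.
setopati_news = ["S1", "S2", "S3"]
ekantipur_news = ["E1", "E2"]

# zip_longest pairs items by position and pads the shorter source with None:
# ("S1", "E1"), ("S2", "E2"), ("S3", None)
pairs = zip_longest(setopati_news, ekantipur_news)

# chain.from_iterable flattens the pairs into one iterable; list() materializes it.
flattened = list(chain.from_iterable(pairs))

# Drop the None padding that was added for the shorter source.
interleaved = [item for item in flattened if item is not None]

print(interleaved)  # ['S1', 'E1', 'S2', 'E2', 'S3']
```

Note that once one source runs out, the remaining items all come from the other source, so the strict alternation only holds while both sources still have items.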

After that we defined the list method. This method takes the list returned from the get_queryset method and applies pagination to it (divides the list into 24 news items on each page). 
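To see why paginating the interleaved list gives an even split, here is a small sketch that slices a list into pages of 24. It uses plain slicing rather than Django REST Framework’s paginator, and get_page is a hypothetical helper written only for this illustration.

```python
PAGE_SIZE = 24  # same page size as in the REST_FRAMEWORK settings


def get_page(items, page, page_size=PAGE_SIZE):
    """Return the items that fall on a 1-based page number (hypothetical helper)."""
    start = (page - 1) * page_size
    return items[start:start + page_size]


# Hypothetical interleaved list: 30 items per source, alternating S, E, S, E, ...
news = [f"{source}{i}" for i in range(1, 31) for source in ("S", "E")]

first_page = get_page(news, 1)
setopati_count = sum(item.startswith("S") for item in first_page)
ekantipur_count = sum(item.startswith("E") for item in first_page)

print(len(first_page), setopati_count, ekantipur_count)  # 24 12 12
```

Because the list alternates between the two sources and 24 is even, every full page contains 12 items from each source, as long as both sources still have items left.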

Now if you try out the API, it will have an equal number of news items from both sources on each page. 

Building the Frontend
Setting up React

Install Node.js from https://nodejs.org/en

In the news-aggregator directory, create a new directory:

frontend

Switch to the directory from your terminal: 

cd frontend 

Then create a React app there:

npx create-react-app news-site

Note:

If your npm version is below 5.2.0, npx is not included with it.

You can check the version of the npm by running the following command: 

npm -v

You can check whether npx is installed by running the following command:

npx -v

If npx is not installed, you can install it separately by running the command below:

npm install -g npx

We are going to be using Tailwind CSS to style our UI. So we will configure Tailwind CSS first.

Go inside the news-site directory in your terminal and run:

npm install -D tailwindcss

 Then run:

npx tailwindcss init 

This should create tailwind.config.js inside the news-site directory.

Replace everything inside with:

This will configure Tailwind for our UI. The content property specifies the file paths where Tailwind should look for class names. It includes all JavaScript, TypeScript, and JSX/TSX files within the src directory.

In the theme section, we extend the default Tailwind CSS configuration and define custom breakpoints under the screens property to control responsiveness. These breakpoints will help us create a responsive UI by applying different styles based on the screen size. The plugins array is currently empty but can be used to add additional functionality or third-party Tailwind plugins if needed.

Open index.css inside the src directory and replace everything with:

 Delete the following files inside the src directory:

 App.css

App.test.js

reportWebVitals.js

setupTests.js

logo.svg

 Open the index.js file and remove <React.StrictMode> and </React.StrictMode> leaving only <App /> 

 Also remove: reportWebVitals(); and its import.

Now, inside the src directory, create three directories: components, features, and pages.

We will store components in the components directory, pages in the pages directory, and API handling code in the features directory.

Let’s start by fetching data from our API using axios. Install axios by running the command: 

npm install axios

Create a file newsApi.js inside the features directory. Add the following inside the newsApi.js:

news-aggregator/frontend/news-site/src/features/newsApi.js

Here, we start by importing the axios library, which allows us to make HTTP requests, and define the base URL of our API as API_BASE_URL.

We then create an asynchronous function called fetchAllNews that accepts a page parameter, defaulting to 1 if no value is provided. This function sends a GET request to the API endpoint, appending the page parameter to retrieve paginated news data.

Inside the function, we use axios.get to make the request to the API, dynamically including the page parameter in the URL. If the request is successful, we return the data property from the API response, which contains the fetched news items. If an error occurs during the request, we catch it and re-throw the error to ensure it can be handled by the calling code.

Next, we will create our component to show news items. We will just create an outline of the component for now and add the functionality later. We will have 24 news items on each page. We will display 24 news items in 4 columns of 6 rows per page. Create NewsShow.js inside the components directory and add the following:

news-aggregator/frontend/news-site/src/components/NewsShow.js

Here we created a component, NewsShow, which returns 24 <div> elements arranged in 6 rows of 4 columns each. Tailwind has prebuilt styles which we can use by adding them to the class name of elements.

The class p-3 adds padding of 12px around an element, the class text-xl sets text size to 20px and so on.

Similarly, the classes grid grid-cols-4 gap-3 arrange child elements in a grid layout of 4 columns with a gap of 0.75rem (12px) between grid items.

The class sm:grid-cols-1 switches to a single-column grid on small screens (up to 640px). This makes our page responsive to changes in screen size. We have already specified the screen size for sm in the tailwind.config.js file.

You can learn about styling in Tailwind CSS here:

https://tailwindcss.com/ 

Here we have filled our page with dummy news items. Later we will replace them with the actual news items stored in our database.

Now let’s create the page which will use the above component to display news items. Create NewsPage.js inside the pages directory. 

news-aggregator/frontend/news-site/src/pages/NewsPage.js

Here, we imported our NewsShow component and placed it below a <header> element.

Now we will test our page. Open App.js inside the src directory and replace everything inside with:

news-aggregator/frontend/news-site/src/App.js

After that in your terminal, run:

npm start

You should get a page with 24 dummy news items at the local address where the React app is running (generally http://localhost:3000/).

Now, let’s include the functionality to show the actual news items stored in our database. For that, we will modify NewsPage.js.

In this updated code, we imported useEffect and useState hooks from React, along with the fetchAllNews function. Next, we defined a data state to hold the fetched news data and setData to update it. Additionally, we added two other states, isLoading and isError, along with their corresponding setters to handle the loading state and potential errors during the API call.

 

Then we used the useEffect hook to manage the side effect of fetching data from the API. Within the hook, we defined an asynchronous function, getNews, that first sets isLoading to true and isError to false. It then calls fetchAllNews with 1 as the page number to retrieve the first page of news data. If the API call fails, isError is set to true. Finally, regardless of success or failure, isLoading is set to false to indicate that loading is complete. The getNews function is called inside the useEffect hook, and the empty dependency array ensures that the effect runs only once, right after the initial render. We then passed the data, isLoading, and isError states, along with null for the error prop, to the corresponding props (data, isLoading, isError, and error) of the NewsShow component.

Next, we will update NewsShow.js, which is inside the components directory.

Here, we updated the NewsShow component to accept data, isLoading, isError, and error as props. Then we added a condition to check whether isLoading is true. If it is, we return the text “Loading…..” to inform the user that content is being fetched. After that, we updated the return statement to check whether results exists inside data. If it doesn’t, we fall back to an empty array. We then use the map method to loop through the list of news items. For each news item, a <div> element is created with the id of the news item as its key. Clicking on this <div> opens the news article link in a new tab by calling window.open with _blank as the target.

Inside each <div> element, an <img> element displays the news article’s image. If image_url is available, it is used as the image source. Otherwise, a placeholder image (noimage.jpg) is displayed instead. Download a placeholder image and store it in the public directory inside the project’s directory. I downloaded the placeholder image from here.

The <div> elements are also styled to respond to changes in screen size, with a hover effect that slightly scales the element and adds a shadow.

Let’s test our application. Make sure the Django server is running, then run the React application: 

npm start

If you check the UI in your browser, you should not see any news yet. Instead, you should get a CORS error.

Open DevTools, reload the page, and monitor the Fetch/XHR requests. You should see the CORS error in the request status.

This is because we haven’t configured CORS in our backend. To configure CORS, switch to the root directory (news-aggregator) and make sure the virtual environment is activated. Then install django-cors-headers:

pip install django-cors-headers 

 

After that, open settings.py inside the config directory and add the settings for CORS. Make sure CorsMiddleware is placed before CommonMiddleware.

Also add http://localhost:3000  to CORS origin whitelist.
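For reference, a typical django-cors-headers setup looks roughly like the excerpt below. This is a sketch rather than the project’s exact settings.py: the other apps and middleware entries are omitted, and your file may list them in a different order.

```python
# config/settings.py (excerpt; the rest of the project's settings are omitted)

INSTALLED_APPS = [
    # ... existing apps ...
    "corsheaders",
]

MIDDLEWARE = [
    "corsheaders.middleware.CorsMiddleware",  # must come before CommonMiddleware
    "django.middleware.common.CommonMiddleware",
    # ... remaining middleware ...
]

# Origins that are allowed to call the API from a browser. Older releases of
# django-cors-headers used the name CORS_ORIGIN_WHITELIST for this setting.
CORS_ALLOWED_ORIGINS = [
    "http://localhost:3000",
]
```

If your React dev server runs on a different port, list that origin instead.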

You can learn more about CORS at: 

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS   

Now restart the Django server and the React app, then reload the page. You should see the news items from our database. However, it shows news items from page 1 only, because we have hard-coded our React app to request the first page. So now, let’s work on pagination. Make the following changes to your code in the NewsPage.js file:

Here, we defined the state variables for page and totalPages to track the current page and total number of pages. Then we updated the fetchAllNews API call to accept the page parameter and calculate the total pages from the returned data, setting it in totalPages. After that we added functions to handle navigation: handlePrevious, handleNext, handleFirst, and handleLast to update the page state appropriately. In the UI, we included buttons for “First,” “Previous,” “Next,” and “Last,” conditionally enabling or disabling them based on whether previous or next pages are available. Then we displayed the current page number between the navigation buttons, and ensured useEffect fetches new data whenever the page changes. This will add the pagination functionality to our app allowing users to navigate through all the pages available.
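The page arithmetic behind these handlers is small enough to check by hand. The component itself is written in JavaScript; the sketch below only illustrates the calculation in Python. It assumes the API response carries a count field with the total number of items (as Django REST Framework’s page-number pagination does by default), and the function names are hypothetical.

```python
import math

PAGE_SIZE = 24  # matches the API page size


def total_pages(count, page_size=PAGE_SIZE):
    """Number of pages needed for `count` items (never less than 1)."""
    return max(1, math.ceil(count / page_size))


def next_page(page, total):
    """Move forward one page, but never past the last page."""
    return min(page + 1, total)


def previous_page(page):
    """Move back one page, but never before the first page."""
    return max(page - 1, 1)


# Hypothetical response body; `count` is the total number of news items.
response = {"count": 50, "results": []}
total = total_pages(response["count"])

print(total, next_page(3, total), previous_page(1))  # 3 3 1
```

The same bounds decide which buttons are disabled: “First” and “Previous” when the current page is 1, “Next” and “Last” when it equals the total number of pages.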

Check the page in your browser. You should have working pagination.

While reviewing the final state of our application, I found that the EkantipurSpider scraper was extracting only relative links to the news articles (e.g. /photo_feature/2024/11/25/npl-trophy-released-from-dharahara-complex-21-36.html). If you already noticed and fixed it, you’re all set! If not, you can fix it by adding this part to the ekantipur.py file:

This will make sure all the relative links are converted to full URLs.
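One common way to do this conversion is to join each link with the site’s base URL. The sketch below uses urljoin from the standard library so it can run on its own; the base URL and the helper name to_absolute are assumptions for illustration, not necessarily what the spider uses. Inside a Scrapy callback, response.urljoin(link) performs the same join against the URL of the page being parsed.

```python
from urllib.parse import urljoin

# Assumed base URL for the Ekantipur portal.
BASE_URL = "https://ekantipur.com"


def to_absolute(link, base_url=BASE_URL):
    """Return an absolute URL; links that are already absolute pass through unchanged."""
    return urljoin(base_url, link)


relative = "/photo_feature/2024/11/25/npl-trophy-released-from-dharahara-complex-21-36.html"
print(to_absolute(relative))
# https://ekantipur.com/photo_feature/2024/11/25/npl-trophy-released-from-dharahara-complex-21-36.html
```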

 

This concludes the third and final part of our blog series. To summarise what we did: first, we built an API using Django REST Framework to serve the news stored in a PostgreSQL database. Next, we developed a web crawler using Scrapy to gather news articles from two portals, Setopati and Ekantipur, extracting the title, link, and image URL of each news item and storing them in the database. Finally, we created a user interface for the application using React, tying everything together into a functional news aggregator.

 

Next Steps

We have a news aggregation application, but there is a lot more functionality you can add to it. For instance, we currently have to run our crawler manually. You could build a scheduling mechanism to run the crawler automatically at fixed intervals, such as hourly or daily. You can also customize the Django admin panel for easier management of news items, including the ability to delete outdated news articles. Furthermore, you can add more spiders to crawl other sites, or categorise the news articles so that users can browse news by topics such as politics, sports, or entertainment. These are just a few ideas. The possibilities for expanding and refining your application are virtually limitless!

 

GitHub link to the project repository: 

https://github.com/pdhakal906/news_aggregator