In this blog post, let's see how we can play around with URLs (Uniform Resource Locators), or simply the addresses of websites on the internet, within a Java program.
Basic Information about the Website
To start with, we'll need the java.net.* set of classes. Then we create a URL object for the URL we want to play with:
URL url = new URL("https://nandunb.wordpress.com");
Next, let's get some basic information about the website: the protocol, the port given in the URL, and the host:
System.out.println("Protocol: " + url.getProtocol());
System.out.println("Port: " + url.getPort());
System.out.println("Host: " + url.getHost());
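One detail worth knowing here: getPort() returns -1 when the URL does not spell out a port, so the example URL above prints -1 rather than 443. Here is a minimal sketch of one way to fall back to the protocol's default port using getDefaultPort(). The class name UrlPorts and the helper effectivePort are made-up names for illustration, not part of java.net.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlPorts {
    // Hypothetical helper: returns the port written in the URL if there is one,
    // otherwise the default port for the URL's protocol.
    static int effectivePort(String spec) {
        try {
            URL url = new URL(spec);
            int port = url.getPort(); // -1 when the URL has no explicit port
            return port != -1 ? port : url.getDefaultPort();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("https://nandunb.wordpress.com"));    // prints 443
        System.out.println(effectivePort("http://localhost:8080/index.html")); // prints 8080
    }
}
```

Neither call opens a network connection; both values come from parsing the URL string alone.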
Displaying the content of the Website (HTML)
Moving on, to display the website's HTML content, we'll need to create a URLConnection object:
URLConnection conn = url.openConnection();
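Then we check whether any content is available using getContentLength(), and use an InputStream to read the content one byte at a time. Note that getContentLength() returns -1 when the length is unknown, so this check only skips responses that report a length of 0:
int c;
if (conn.getContentLength() != 0) {
    InputStream urlInput = conn.getInputStream();
    while ((c = urlInput.read()) != -1) {
        System.out.print((char) c);
    }
    urlInput.close();
}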
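Casting each byte to a char works for plain ASCII pages, but characters that take more than one byte (for example, accented letters in UTF-8) can come out garbled. Below is a rough sketch of reading the stream as text through an InputStreamReader instead. The class StreamText and the helper readAll are hypothetical names, and UTF-8 is only an assumption here; a real page may declare a different encoding. To keep the sketch runnable without a network connection, it reads from an in-memory stream rather than from conn.getInputStream().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamText {
    // Hypothetical helper: decodes the whole stream into a String using the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream) when done
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            int c;
            while ((c = reader.read()) != -1) { // read() now returns decoded characters, not raw bytes
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Stand-in for conn.getInputStream(): "café" encoded as UTF-8 (assumed encoding)
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
    }
}
```

With a live connection, the same helper could be called as readAll(conn.getInputStream(), StandardCharsets.UTF_8), again assuming the page is UTF-8.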
Complete Code
import java.net.*;
import java.io.*;

public class URLProg {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.google.com");

        // Basic information parsed from the URL itself
        System.out.println("Basic Information: ");
        System.out.println("Protocol: " + url.getProtocol());
        System.out.println("Port: " + url.getPort()); // -1 if the URL has no explicit port
        System.out.println("Host: " + url.getHost());

        URLConnection conn = url.openConnection();
        int c;
        // getContentLength() returns -1 when the length is unknown
        if (conn.getContentLength() != 0) {
            System.out.println("Content: ");
            InputStream urlInput = conn.getInputStream();
            // Read one byte at a time until the end of the stream
            while ((c = urlInput.read()) != -1) {
                System.out.print((char) c);
            }
            urlInput.close();
        } else {
            System.out.println("Sorry. No content!");
        }
    }
}